A METHOD OF STUDYING MANNER OF GROWTH


S. C. PEARCE 


East Malling Research Station, 
Maidstone, Kent, England 


In the study of growing organisms questions sometimes arise con- 
cerning the growth rates of individuals, whether, for example, they are 
correlated with size, are constant, or what else. It is here suggested 
that such questions can sometimes be answered by noting the relation- 
ship between the standard error and the mean of a group of developing 
organisms. 

Suppose, for example, that the individuals are developing in con- 
ditions such that all have the same percentage growth rate, then their 
relative sizes will remain constant, and so will the standard error of 
log (size). If, on the other hand, those that are initially larger grow 
more rapidly, the relative sizes will continually pull farther apart, 
and the standard error of log (size) will increase. This is a simple 
example, but one that can be extended to more complex cases. 

Three standard errors prove especially useful in studies of this
kind:

$\sigma_t$, that of log (size) at time $t$,

$\sigma_{t0}$, that of the increment in log (size) between time 0 and time $t$,

$\sigma_t'$, that of log (size) at time $t$ after adjustment by covariance on the
corresponding values at time 0.

However, only two of these are independent, because if

$$\gamma = \rho_{t0}\sigma_t - \sigma_0 ,$$

it follows that

$$\sigma_t'^2 = \sigma_{t0}^2 - \gamma^2 ,$$

because

$$\sigma_{t0}^2 = \sigma_t^2 - 2\rho_{t0}\sigma_t\sigma_0 + \sigma_0^2 ,$$

where $\rho_{t0}$ is the correlation coefficient between log (size) at
times 0 and $t$, and

$$\sigma_t'^2 = \sigma_t^2 (1 - \rho_{t0}^2) .$$

Strictly this result applies only for sums of squares, but it may be used
for variances also if degrees of freedom are many.
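The dependence among the three standard errors can be checked numerically. The following is a minimal sketch in Python, assuming that $\gamma$ is taken as $\rho_{t0}\sigma_t - \sigma_0$ and that all quantities are computed from corrected sums of squares and products; the simulated values of log (size), the sample size and the function names are hypothetical and serve only as an illustration.

```python
import math
import random


def sums_of_squares(l0, lt):
    """Corrected sums of squares and products for log(size) at times 0 and t."""
    n = len(l0)
    m0 = sum(l0) / n
    mt = sum(lt) / n
    s00 = sum((a - m0) ** 2 for a in l0)
    stt = sum((b - mt) ** 2 for b in lt)
    st0 = sum((a - m0) * (b - mt) for a, b in zip(l0, lt))
    return s00, stt, st0


def three_errors(l0, lt):
    """Return sigma_t, sigma_t0, sigma_t' and gamma on the sum-of-squares scale."""
    s00, stt, st0 = sums_of_squares(l0, lt)
    sigma_0 = math.sqrt(s00)
    sigma_t = math.sqrt(stt)
    rho = st0 / (sigma_0 * sigma_t)
    # Increment in log(size) between time 0 and time t.
    sigma_t0 = math.sqrt(stt - 2.0 * st0 + s00)
    # log(size) at time t after adjustment by covariance on time 0.
    sigma_adj = sigma_t * math.sqrt(1.0 - rho ** 2)
    gamma = rho * sigma_t - sigma_0  # assumed definition of gamma
    return sigma_t, sigma_t0, sigma_adj, gamma


# Hypothetical data: 50 individuals, smaller ones growing slightly faster.
rng = random.Random(1)
l0 = [rng.gauss(3.0, 0.2) for _ in range(50)]
lt = [a + rng.gauss(0.5, 0.1) - 0.3 * (a - 3.0) for a in l0]

sigma_t, sigma_t0, sigma_adj, gamma = three_errors(l0, lt)
print(sigma_adj ** 2, sigma_t0 ** 2 - gamma ** 2)  # the two values agree
```

For sums of squares the agreement is exact apart from rounding, which is the sense in which only two of the three quantities are independent.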


Example 1 


To develop the example given in the second paragraph, suppose
that each individual has its own growth rate, that for the $i$th individual
being $g_i$ times the average of the group. Let its value of log (size) at
time $t$ be $L_{it}$, the correlation coefficient between $g_i$ and $L_{i0}$ be $\rho$, and let $f_t$
be the mean difference between $L_{it}$ and $L_{i0}$ for the group. Then

$$L_{it} = L_{i0} + g_i f_t .$$

Hence

$$\sigma_t^2 = \sigma_0^2 + 2\rho f_t \sigma_0 \sigma_g + f_t^2 \sigma_g^2 ,$$

where $\sigma_g$ is the standard error of the $g_i$'s. Also,

$$\sigma_{t0} = f_t \sigma_g \quad \text{and} \quad \gamma = \rho f_t \sigma_g .$$

Consequently if $\sigma_{t0}$ and $\sigma_t'$ are plotted against $f_t$, each will give a
straight line through the origin, the slopes being $\sigma_g$ and $\sigma_g\sqrt{1 - \rho^2}$
respectively. If $\sigma_t$ is plotted also, its initial slope will have the same
sign as $\rho$, the value being in fact $\rho\sigma_g$.


Example 2 


In practice, of course, the problem is usually one of suggesting what 
growth laws could have given rise to the graphs, the shapes of which 
are known empirically. Thus, in Figure I are shown the curves for
$\sigma_t$, $\sigma_{t0}$, and $\sigma_t'$ for four sets of apple trees, the size being measured by
trunk circumference. Trial I was a $2^3$ NPK factorial manurial trial
of Cox’s Orange Pippin bud-grafted on the rootstock Malling I. The 
effect of potash was very marked, and in consequence the data were 
examined in two parts, Trial IA representing the 32 trees receiving 
high dressings of potash (eight blocks of the four treatment-combi- 
nations of nitrogen and phosphate), and Trial IB the remaining 32 
trees, which had low potash dressings. Standard errors were calculated 
from the error line after allowance had been made for blocks and treat- 
ments. Trial II was designed with eight trees to a plot, four of Cox’s 
Orange Pippin bud-grafted on Malling IX, and four of Beauty of 
Bath similarly worked. These two varieties were regarded as forming 
Trials IIA and IIB respectively, the standard errors being those of 
trees within plots. Certain trees were excluded as being damaged,
insufficiently protected from wind, or otherwise atypical, and the errors 
had 49 and 56 degrees of freedom respectively. 


FIGURE I

GRAPHS OF $\sigma_t$, $\sigma_{t0}$ AND $\sigma_t'$ AGAINST $f_t$ FOR THE FOUR TRIALS (PANELS: TRIAL IA, TRIAL IB, TRIAL IIA, TRIAL IIB).

The curve for $\sigma_t$ is the unbroken one, that for $\sigma_{t0}$ is singly-broken, and that for $\sigma_t'$ doubly-broken.
The scales are such as to make $\sigma_0$ equal to one.
The curves for Trials IA and IB extend from planting at the end
of the 1930 growing season till the dormant season of 1947-8; those 
for IIA and IIB extend from 1932 till 1948-9. 

Examination of the curves suggests the following steps in building 
up a mathematical model. 

(1) At the beginning the values of $\sigma_t$ are falling. This shows that
the initial growth rates were negatively correlated with initial size.

(2) There is a tendency for the curves of $\sigma_t$ and $\sigma_t'$ to touch, i.e.,
there comes a time when the sizes of individuals are no longer correlated
with their initial sizes.

(3) Having touched, the curves of $\sigma_t$ and $\sigma_t'$ do not separate again.
If growth rates were still correlated with initial size, they would do so.

(4) Even after the curves of $\sigma_t$ and $\sigma_t'$ have fused, that of $\sigma_{t0}$ continues
to rise. It follows that individuals still have different growth
rates, though the initial ones no longer apply.

(5) The curves for $\sigma_{t0}$ and $\sigma_t'$ are not straight, which confirms that
growth rates have not been constant.

The simplest hypothesis appears to be that the initial growth rates
$g_i$ give way to subsequent growth rates $h_i$, all trees changing over
together, so that at time $t$ the growth rate of tree $i$ is $(1 - \phi_t) g_i + \phi_t h_i$,
where $\phi_t$, which is the same for all trees, increases from 0 to 1. If this
is so, the growth made at any time can be considered in two parts, $F_t$
under the initial growth rates $g_i$ and $(f_t - F_t)$ under the subsequent
ones $h_i$. In fact,

$$L_{it} = L_{i0} + g_i F_t + h_i (f_t - F_t) .$$

If the subsequent growth rates are correlated with neither initial size
nor initial growth rates, it follows that

$$\sigma_{t0}^2 = F_t^2 \sigma_g^2 + (f_t - F_t)^2 \sigma_h^2 ,$$

and

$$\gamma = F_t \rho \sigma_g ,$$

where $\sigma_h$ is the standard error of the $h_i$'s.
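The curves implied by this two-stage model can be computed directly. The sketch below, in Python, assumes the relations $\sigma_{t0}^2 = F_t^2\sigma_g^2 + (f_t - F_t)^2\sigma_h^2$, $\gamma = F_t\rho\sigma_g$, $\sigma_t^2 = \sigma_0^2 + 2\gamma\sigma_0 + \sigma_{t0}^2$ and $\sigma_t'^2 = \sigma_{t0}^2 - \gamma^2$; the function name and all numerical values are hypothetical and are not estimates from the trials.

```python
import math


def model_errors(f_t, F_t, sigma_0, sigma_g, sigma_h, rho):
    """Return (sigma_t, sigma_t0, sigma_t') under the two-stage growth model.

    f_t is the mean increment in log(size); F_t is the part of it made
    under the initial growth rates; rho is the correlation between the
    initial growth rates and initial log(size).
    """
    gamma = F_t * rho * sigma_g
    var_t0 = (F_t * sigma_g) ** 2 + ((f_t - F_t) * sigma_h) ** 2
    var_t = sigma_0 ** 2 + 2.0 * gamma * sigma_0 + var_t0
    var_adj = var_t0 - gamma ** 2
    return math.sqrt(var_t), math.sqrt(var_t0), math.sqrt(var_adj)


# While F_t equals f_t the model reduces to Example 1: sigma_t0 = f_t * sigma_g.
early = model_errors(f_t=0.5, F_t=0.5, sigma_0=1.0, sigma_g=0.2,
                     sigma_h=0.3, rho=-0.5)

# When F_t * rho * sigma_g = -sigma_0, final size is uncorrelated with
# initial size, and the curves for sigma_t and sigma_t' coincide.
late = model_errors(f_t=8.0, F_t=5.0, sigma_0=1.0, sigma_g=0.4,
                    sigma_h=0.3, rho=-0.5)
print(early)
print(late)
```

With a negative $\rho$ the first call gives a value of $\sigma_t$ below $\sigma_0$, which corresponds to the initial fall noted in step (1).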


Adequacy of the model 


With such a model the initial slope of the curve for $\sigma_t$ will be $\rho\sigma_g$,
for that of $\sigma_{t0}$ it will be $\sigma_g$, and for that of $\sigma_t'$ it will be $\sigma_g\sqrt{1 - \rho^2}$.
These values will apply for as long as $F_t$ equals $f_t$; when, however, it
attains a constant value on account of the initial growth rates making
no further contribution, the slopes will be different, all having an
asymptotic value of $\sigma_h$, but those of $\sigma_t$ and $\sigma_t'$ being greater than that
of $\sigma_{t0}$. The shapes of the curves between these two extremes, i.e.,
$F_t = f_t$, and $F_t$ constant, will depend upon the manner in which the
initial growth rates give way to the subsequent ones.

Examination of the graphs of Figure I shows that they have been 
reasonably well described. There remains, however, the possibility 
of there being further correlations to take into account. First, suppose 
that $h_i$ and $L_{i0}$ have a correlation coefficient of $\rho'$. Then $\gamma$ will equal

$$F_t \rho \sigma_g + (f_t - F_t) \rho' \sigma_h$$

and may be expected to increase or decrease indefinitely according to
the sign of $\rho'$. In Table I are set out its values for different ages from
planting, and it will be seen that they increase to a limit; hence $\rho'$ must
be small.


TABLE I
VALUES OF $-\gamma$ FOR THE FOUR TRIALS
(The Higher Age Applies to Trials IA and IB, the Lower to IIA and IIB)

Age          IA      IB      IIA     IIB
2 or 3       0.036   0.035   0.015   0.023
4 or 5       0.048   0.045   0.024   0.031
8 or 9       0.053   0.051   0.024   0.038
12 or 13     0.058   0.052   0.023   0.040
16 or 17     0.061   0.052   0.023   0.039


This leaves open the possibility of an appreciable correlation between 
$g_i$ and $h_i$. If there were one, extension of the mathematical model does
not suggest that the curves need be very different, so the possibility
remains.

A criticism that could reasonably be made concerns the assumption 
that all trees change over together from their initial growth rates to 
their subsequent ones. Probably they are not perfectly in step, but the 
model as at present formulated leaves open the whole question of how 
the initial growth rates give way to the subsequent ones, so the matter 
can hardly be put to the test. 


Biological interpretation of the model 


The model arrived at is a reasonable one. Plainly plants do differ 
in their sizes at planting, and the smaller ones often “get away better”.
The subsequent growth rates must result in part from positional differ-
ences in the field, and, since the trees were allocated to their positions 
at random, it is not surprising to find initial size and subsequent growth 
rates unrelated. 

The model has been tested in other ways and appears to be sound. 
It is not suggested that an examination of the curves is itself sufficient; 
the advantage of the method is rather that by plotting certain standard 
errors, which may well be wanted anyway, it enables mathematical 
models to be formulated for test, or shows that those already formulated 
are plainly inadequate. 


Summary 


In a group of developing organisms the relationship between such 
quantities as the standard error of log (size) and the mean increment 
in log (size) will sometimes suggest a law of growth. This is illustrated by 
some results with apple trees. 
ANALYSIS OF COVARIANCE FOR A 3 X 4 TRIPLE
RECTANGULAR LATTICE DESIGN (3 ASSOCIATE P.B.I.B.) 


BERNARD S. PASTERNACK 


Department of Biostatistics, University of North Carolina 
Chapel Hill, North Carolina, U.S. A. 


1. INTRODUCTION


The technique of analysis of covariance which was first introduced 
by R. A. Fisher [1934] has received detailed appraisal and extensive 
broadening of scope in a special issue of Biometrics published in Septem- 
ber 1957. In that issue the nature and principal uses of the analysis 
of covariance are discussed with exceptional clarity in an article by 
W. G. Cochran [1957]; H. F. Smith [1957] offers an exposition on the 
theory, interpretation, and relationship of analysis of covariance with 
regression analysis; and the general case of r concomitant variables 
with special reference to incomplete block designs—both intra- and 
inter-block analyses included—is presented by M. Zelen [1957]; other 
papers appearing in that special issue include those by W. T. Federer 
[1957], G. N. Wilkinson [1957], D. J. Finney [1957] and I. Coons [1957]. 

B. Harshbarger [1946, 1947, and 1949] developed a set of incomplete 
block designs in which the number of varieties (or treatments) is the 
product of two consecutive integers and the number of replications of 
each variety is either 2 or 3 and their multiples. He called them simple 
and triple rectangular lattices respectively. An early report on the 
analysis of such designs was put forth by Robinson and Watson [1949]. 
A few years later K. R. Nair [1951, 1952] showed that a 3 X 4 triple 
rectangular lattice is a 3-associate partially balanced incomplete block 
design. V. N. Murty [1953] has given a numerical illustration of the 
analysis for such a design following the methods of K. R. Nair. 

It is not generally realized, as pointed out by M. Zelen [1957] and 
others, that regardless of the experimental incomplete block design in 
question the problem of analysis of covariance is directly dependent 
upon the solution of the ordinary problem of analysis of variance for 
that design. Hence, once the analysis of variance solution is obtained, 
the corresponding problem of analysis of covariance is all but solved
(except for additional computations). The primary cause of confusion 
that some research workers experience in the analysis of covariance of 
an incomplete block design apparently involves uncertainty in the 
calculation of the adjusted sum of products due to treatments. The 
following presentation is intended to clarify this issue with the belief 
that the interested reader will have little difficulty in adjusting the 
procedures illustrated in this paper to other incomplete block designs. 

The main objective, then, of this paper will be to illustrate the compu- 
tations that are necessary in order to extend intra-block analysis of 
variance for a 3 associate partially balanced incomplete block design 
to intra-block analysis of covariance for that same partially balanced 
incomplete block design. A discussion will follow illustrating the 
uncertainty of just what effects may be represented by either error 
(“internal”) regression, or treatment (“external”) regression. The
incomplete block design under consideration will be the following 
3 X 4 triple rectangular lattice. 


2. ANALYSIS OF COVARIANCE ILLUSTRATED ON A 3 X 4 
TRIPLE RECTANGULAR LATTICE DESIGN 


The 12 varieties may be written in the form (variety numbers are shown above the three-digit labels):

          1     2     3
    x    124   132   143
    4           5     6
   213    x    234   241
    7     8           9
   314   321    x    342
   10    11    12
   412   423   431    x

Note that with respect to any variety all the varieties which appear
with it in the same block have 2 digits alike, and among those varieties
which do not appear with it in the same block some have 2 digits alike
and others have 3 digits alike.
TABLE 1
NUMBER OF PLANTS PER PLOT (x) AND YIELD PER PLOT (y), BY REPLICATION AND BLOCK
(Variety numbers in parentheses. The body of this table and of the computational Tables 1a and 1b is illegible in the source; the grand total of yield is $G_y$ = 90.651.)
If these groups are called first, second, and third associates respectively,
the above triple rectangular lattice is a p.b.i.b. design. (See Nair 
[1951] and Murty [1953]; the following notation has been chosen to be 
consistent with Murty [1953]). 

For this design 


v= 12,b=12,r=3,k =3 


where 
v = number of varieties, 
b = number of blocks, 
r = number of replications of each variety, 


k = number of varieties occurring in each block. 


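The matrix

$$\begin{pmatrix} A_{13} & B_{13} & C_{13} \\ A_{23} & B_{23} & C_{23} \\ A_{33} & B_{33} & C_{33} \end{pmatrix} = \begin{pmatrix} 6 & -1 & 0 \\ -3 & 7 & -1 \\ 3 & 1 & 9 \end{pmatrix}$$

gives

$$F = B_{23}C_{33} - B_{33}C_{23} = 64,$$
$$G = B_{33}C_{13} - B_{13}C_{33} = 9,$$
$$H = B_{13}C_{23} - B_{23}C_{13} = 1,$$

and it can be seen that the determinant $\Delta$ of this matrix is

$$\Delta = A_{13}F + A_{23}G + A_{33}H = 360.$$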
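The quantities $F$, $G$, $H$ and $\Delta$ are the first-column cofactors and the determinant of the 3 X 3 matrix, and may be verified as follows. The sketch is in Python; the matrix entries used, in particular the negative ones, are an assumption, chosen because they reproduce the quoted values $F = 64$, $G = 9$, $H = 1$ and $\Delta = 360$.

```python
# Columns of the matrix, each listed from the first row to the third row.
# The entries are assumed (see the lead-in), not read directly from the source.
A = (6, -3, 3)    # A13, A23, A33
B = (-1, 7, 1)    # B13, B23, B33
C = (0, -1, 9)    # C13, C23, C33

# Cofactors of the first column.
F = B[1] * C[2] - B[2] * C[1]
G = B[2] * C[0] - B[0] * C[2]
H = B[0] * C[1] - B[1] * C[0]

# Determinant by expansion along the first column.
delta = A[0] * F + A[1] * G + A[2] * H
print(F, G, H, delta)  # 64 9 1 360
```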
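$(\alpha)$ = Total yield of the variety.
$(\beta)$ = Total of the blocks in which the variety occurs.
$(\gamma)$ = $k \cdot (\alpha) - (\beta)$.
$(Q)$ = $(\gamma)/k$.
$(\delta_1)$ = First associates of the variety.
$(\epsilon_1)$ = Sum of $(\gamma)$ for varieties in $(\delta_1)$.
$(\delta_2)$ = Second associates of the variety.
$(\epsilon_2)$ = Sum of $(\gamma)$ for varieties in $(\delta_2)$.
$(\delta_3)$ = Third associates of the variety.
$(\epsilon_3)$ = Sum of $(\gamma)$ for varieties in $(\delta_3)$.
$(\eta)$ = $F\gamma + G\epsilon_1 + H\epsilon_2 = 64\gamma + 9\epsilon_1 + \epsilon_2$.
$(\hat t_x)$ = $(\eta_x)/\Delta$.
$(\hat t_y)$ = $(\eta_y)/\Delta$.
$(\hat t)$ + Grand Mean = Adjusted Varietal Mean.

S.S. due to varieties (eliminating blocks) = $\sum (\gamma)(\eta)/k\Delta = \sum (\hat t)(Q)$.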
Now, the sum of squares ($x^2$) due to varieties eliminating blocks is
equal to

$$T_{xx} = \sum (\gamma_x)(\eta_x)/k\Delta = \sum (\hat t_x)(Q_x) = 1721.17^{*}$$

and the sum of squares ($y^2$) due to varieties eliminating blocks is

$$T_{yy} = \sum (\gamma_y)(\eta_y)/k\Delta = \sum (\hat t_y)(Q_y) = 9.621259^{**}.$$

The corresponding sum of products ($xy$) due to varieties eliminating
blocks is given by

$$T_{xy} = \sum (\gamma_x)(\eta_y)/k\Delta = \sum (\hat t_y)(Q_x)$$
$$= \sum (\gamma_y)(\eta_x)/k\Delta = \sum (\hat t_x)(Q_y) = 71.589850.$$

In similar fashion [Total S.S. and S.P.]

$$S_{xx} = \sum x^2 - (G_x^2/n) = 3626.31,$$
$$S_{yy} = \sum y^2 - (G_y^2/n) = 25.019103,$$
$$S_{xy} = \sum xy - (G_x G_y/n) = 143.538917,$$

and [S.S. and S.P. due to blocks ignoring varieties]

$$B_{xx} = \sum (B_x^2/k) - (G_x^2/n) = 1498.31,$$
$$B_{yy} = \sum (B_y^2/k) - (G_y^2/n) = 11.290912,$$
$$B_{xy} = \sum (B_x B_y/k) - (G_x G_y/n) = 70.369917.$$


Finally, the sums of squares and sum of products belonging to error
are obtained by subtraction. That is

$$E_{xx} = S_{xx} - T_{xx} - B_{xx} = 406.83,$$
$$E_{yy} = S_{yy} - T_{yy} - B_{yy} = 4.106932,$$
$$E_{xy} = S_{xy} - T_{xy} - B_{xy} = 1.579150.$$


Also, we will have need for the following expressions (the notation
will be kept consistent with that of Smith [1957])

$b_T = T_{xy}/T_{xx}$ = treatment regression of $y$ on $x$,
$b_E = E_{xy}/E_{xx}$ = error regression of $y$ on $x$.

The estimated error variances for these two estimates are

$$\mathrm{var}(b_T) = T^*/[(v - 2)T_{xx}] \quad \text{and} \quad \mathrm{var}(b_E) = E^*/[(n_e - 1)E_{xx}],$$

where

$v$ = total number of treatments in the experiment,
$n_e$ = the number of degrees of freedom belonging to error,
$T^* = T_{yy} - b_T T_{xy} = T_{yy} - Y_1^2$ = sum of squares for treatments adjusted by $b_T$,
$E^* = E_{yy} - b_E E_{xy} = E_{yy} - Y_E^2$ = sum of squares for error adjusted by $b_E$,
$Y_1^2 = T_{xx} b_T^2$ = square due to treatment (“external”) regression,
$Y_E^2 = E_{xx} b_E^2$ = square due to error (“internal”) regression,
$Y_1'^2$ = square due to regression for pooled treatment and error,
$Y_2^2 = T_{xx}E_{xx}(b_T - b_E)^2/(T_{xx} + E_{xx}) = Y_1^2 + Y_E^2 - Y_1'^2$ = square due to “heterogeneity” of treatment and error regressions,
$Y_3, \cdots, Y_v$ = $(v - 2)$ other linear functions of $y$ such that $\sum_{i=3}^{v} Y_i^2 = T^*$, i.e. the sum of squares for deviations of treatment means from treatment regression, and $\sum_{i=2}^{v} Y_i^2$ = reduced treatment sum of squares.

*From Table 1a.
**From Table 1b.

TABLE 2
ANALYSIS OF COVARIANCE OF STANDARD YIELD OF CORN WITH TEST OF SIGNIFICANCE

                                         Sums of Squares & Products             Errors of Estimate
Source of Variation              D.F.    (x^2)      (xy)          (y^2)         Sums of Squares   D.F.   M.S.
Total                             35     3626.31    143.538917    25.019103
Blocks (ignoring varieties)       11     1498.31     70.369917    11.290912
Varieties (eliminating blocks)    11     1721.17     71.589850     9.621259
Error                             13      406.83      1.579150     4.106932      4.100802         12     .341734
Varieties & Error                 24     2128.00     73.169000    13.728191     11.217051         23
Variety sum of squares adjusted for blocks and
concomitant variate (reduced treatment S.S.)                                     7.116249         11     .646932

$F = 0.646932/0.341734 = 1.89$, whereas $F_{11,12}(.05) = 2.72$.
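The computations of this section can be retraced from the tabulated sums of squares and products alone. The following minimal sketch in Python takes those three lines of Table 2 as input, with $v = 12$ and $n_e = 13$ as given in the text; the variable names are hypothetical, and the reduced treatment mean square is copied from Table 2 and not recomputed.

```python
# Sums of squares and products quoted in the text, ordered (xx, xy, yy).
total = (3626.31, 143.538917, 25.019103)
blocks = (1498.31, 70.369917, 11.290912)      # ignoring varieties
varieties = (1721.17, 71.589850, 9.621259)    # eliminating blocks

# Error line obtained by subtraction.
e_xx, e_xy, e_yy = (s - b - t for s, b, t in zip(total, blocks, varieties))
t_xx, t_xy, t_yy = varieties

v = 12      # number of treatments
n_e = 13    # degrees of freedom belonging to error

b_t = t_xy / t_xx             # treatment ("external") regression
b_e = e_xy / e_xx             # error ("internal") regression
t_star = t_yy - b_t * t_xy    # treatment S.S. adjusted by b_t
e_star = e_yy - b_e * e_xy    # error S.S. adjusted by b_e

var_b_t = t_star / ((v - 2) * t_xx)
var_b_e = e_star / ((n_e - 1) * e_xx)
error_ms = e_star / (n_e - 1)

# Reduced treatment mean square as printed in Table 2 (not recomputed here).
reduced_treatment_ms = 0.646932
f_ratio = reduced_treatment_ms / error_ms

print(round(b_t, 4), round(b_e, 4), round(error_ms, 6), round(f_ratio, 2))
```

The error mean square and the two regression coefficients obtained in this way agree with the values entered in Table 2 and used in the later tables.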


In this design the estimated error variance for the difference of
two estimated treatments, say $\hat t_i$ and $\hat t_j$, is given by

$$\mathrm{var}(\hat t_i' - \hat t_j') = \mathrm{var}(\hat t_{iy} - \hat t_{jy}) + (\hat t_{ix} - \hat t_{jx})^2 \frac{E^*}{(n_e - 1)E_{xx}},$$

where

$$\mathrm{var}(\hat t_{iy} - \hat t_{jy}) = \frac{2k(F - G)}{\Delta} \cdot \frac{E^*}{n_e - 1} = 11/12(.341734) = .313256$$

if $i$ and $j$ are first associates;

$$= \frac{2k(F - H)}{\Delta} \cdot \frac{E^*}{n_e - 1} = 21/20(.341734) = .358821$$

if $i$ and $j$ are second associates;

$$= \frac{2kF}{\Delta} \cdot \frac{E^*}{n_e - 1} = 16/15(.341734) = .364132$$

if $i$ and $j$ are third associates;
and

$$(\hat t_{ix} - \hat t_{jx})^2 \frac{E^*}{(n_e - 1)E_{xx}}$$

depends upon the particular pair of treatments $(i, j)$.

TABLE 3
COMPUTATION OF TREATMENT MEANS ($\hat t'$) ADJUSTED FOR
BLOCKS AND THE CONCOMITANT VARIATE*

Treatment
No. i      t_iy       t_ix        t'_i      t'_i + mean
  1        .2857      .5389      .2836      2.8017
  2        .9992     2.8972      .9882      3.5063
  3       -.9698    -6.9361     -.9434      1.5747
  4       -.7229   -11.0361     -.6808      1.8373
  5        .4173    -5.4111      .4379      2.9560
  6       -.3412     5.0722     -.3605      2.1575
  7       -.4326   -12.6028     -.3845      2.1336
  8        .9428     6.7639      .9170      3.4350
  9       -.7820    -5.6111     -.7606      1.7575
 10        .0068    16.1389     -.0548      2.4632
 11        .4812     8.5222      .4487      2.9667
 12        .1155     1.6639      .1092      2.6272

*$\hat t_i' = \hat t_{iy} - b_E \hat t_{ix}$, $b_E = 0.003882$, and $\bar y = 90.651/36 = 2.5181$.
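The three multipliers of the error mean square follow directly from $k$, $F$, $G$, $H$ and $\Delta$. The sketch below, in Python, evaluates them as exact fractions and also illustrates the adjustment of a treatment mean, assuming the adjustment takes the form $\hat t_y - b_E \hat t_x + \bar y$; the function name is hypothetical.

```python
from fractions import Fraction

k, F, G, H, delta = 3, 64, 9, 1, 360

# Multipliers of the error mean square for the three associate classes.
factors = {
    "first": Fraction(2 * k * (F - G), delta),
    "second": Fraction(2 * k * (F - H), delta),
    "third": Fraction(2 * k * F, delta),
}
print(factors)  # 11/12, 21/20 and 16/15


def adjusted_mean(t_y, t_x, b_e=0.003882, grand_mean=2.5181):
    """Treatment mean adjusted for blocks and the concomitant variate.

    Assumes the adjustment t_y - b_e * t_x, followed by adding the grand mean.
    """
    return t_y - b_e * t_x + grand_mean


# Treatment 1 of Table 3.
first_treatment = adjusted_mean(0.2857, 0.5389)
print(round(first_treatment, 4))
```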


3. DISCUSSION 


The following discussion, somewhat modified, is extracted from the 
University of North Carolina Institute of Statistics Mimeo. Series No. 
156 entitled Analysis of Covariance: A review—by H. F. Smith. The 
analysis of covariance for this experiment was made available to 
Professor Smith for use as an example to illustrate just what effects 
may be represented by regressions, either internal or external. For 
this discussion the relevant parts of this analysis are shown in Table 4. 


TABLE 4
EXTENDED ANALYSIS OF COVARIANCE

Source of Variation                                  D.F.    S.S.      M.S.      F
Variety regression ($Y_1^2$)                           1     2.9777    2.9777    4.48 (P = .059)
Variety vs. error regression ($Y_2^2$)                 1      .4726
Deviations from variety regression ($T^*$)            10     6.6436     .6644
Variety S.S. adjusted for blocks and
concomitant variate                                   11     7.1162     .6469    1.89 (P > .10)
Error regression ($Y_E^2$)                             1      .0061
Deviations from error regression                      12     4.1008     .3417

$b_T$ = 0.0416, $b_E$ = 0.0039, var($b_T$) = 0.000386, var($b_E$) = 0.000840, s($b_T$) = 0.0197,
s($b_E$) = 0.0290. 95% C.L.: $-0.002 < b_T < 0.086$, 95% C.L.: $-0.059 < b_E < 0.067$.

The key point at issue is how to utilize the available data on number 
of plants per plot in the analysis of this experiment. Three situations 
can be visualized: 


(i) If the number of plants per plot is a reflection of the varying
fertility, then it would presumably be permissible to use a conventional 
analysis of covariance. 

(ii) If the number of plants per plot is a reflection of the fact that 
some varieties germinate better than others, then one would not use a 
conventional covariance analysis, but rather a usual analysis of variance 
with perhaps a breakdown of the treatment sum of squares which could 
be accounted for by the number of plants per plot. 

(iii) The third situation could be a combination of (i) and (ii), i.e. 
the number of plants per plot is both a reflection of the different fertilities 
of the plots and that some varieties germinate better than others. 

Initially, we observe that varieties show significant differences in 
yield and highly significant differences in plant number. One worker 
looking at the remainder regression alone might come to the conclusion 
that plant density had no measurable effect on yield. Another, on the 
other hand, after performing an analysis of covariance, and noting that 
the adjusted yields of the varieties do not differ significantly, could 
conclude that the original yield responses may be attributed to plant 
numbers. 

Now the variety regression, when compared to the variance of 
variety means about it, yields a value of F = 4.48, which may be con- 
sidered a real response since the probability of observing a value of 
F this large or larger is approximately P = .059. The reduced variety
sum of squares has become non-significant for two reasons: (1) a third of
the original variety sum of squares is associated with plant number,
leaving deviations about the variety regression non-significant, and (2)
since variation of x within varieties and blocks is low, the internal
regression is not determined accurately enough to demonstrate a
discrepancy with the variety regression even when evaluated against
internal error.

Suppose we assume that, mean square about variety regression 
being greater than error variance, there may be some variety effects 
on yield although not large enough to be demonstrated. If we discount
regression, however, the following information would remain concealed. 

The upper 95 per cent confidence limit of $b_E$ is 0.067, therefore the
variety response is well within a region of possible response to plant 
number not contradicted by the evidence from replicates. If it is 
known that varieties were actually sown with equal numbers of seed 
it must be concluded that varieties vary both in germinating ability 
(at least for the seed stocks used in this experiment) and yield. Thus 
situation (iii) visualized earlier seems to prevail. However, this ex- 
periment gives no clue to decide whether the variation in yield is mostly 
environmental response to plant number after that has been established,
or that growing vigor is associated with germinating potential and would 
establish the yield effect independently of plant number. 
COMPETITION IN POPULATIONS CONSISTING OF ONE AGE
GROUP* 


SØGAARD ANDERSEN


Statens Skadedyrlaboratorium, Springforbi, Denmark 


When animal populations do not increase indefinitely it is thought 
to be due to interaction (competition s. lat.) between single organisms. 
This interaction may be interspecific or intraspecific. 

Intraspecific interaction may occur between age groups (or develop- 
mental stages) and within age groups. 

In insects e.g., interaction may occur between larvae and eggs, 
between adults and pupae, etc. The dependence of these interactions 
on the population density has seldom been subject to experiments. The 
best known example is the interaction between males and eggs of the
flour beetles Tribolium and Oryzaephilus, viz. the cannibalistic egg
eating investigated by Crombie [1943]. His results may be described
by the equation:


$$c = kix/m,$$

where $c$ is the number of eggs consumed, $i$ the number of insects, $x$ the
initial number of eggs, $m$ the amount of flour, and $k$ is a constant.
Under more natural conditions this interaction is complicated by
tunnelling (Park [1933], Stanley [1949]) and by more intensive egg
eating by females than by males (Rich [1956]).

Experiments describing the effect of the density on the interaction 
within a single age group are far more numerous. Some of these are 
designed to investigate the interaction between adults as measured by 
its influence upon the fecundity, and other papers deal with the inter- 
action within a single age group during its total development from newly 
hatched larva to pupa or adult. Papers dealing with such experiments 
are so numerous that it seems worthwhile generalizing the results. 

The interaction within a single age group may be measured by the 
death rate, the birth rate, the sex ratio, the rate of development, and 
the rate of movement, all expressed as functions of the density. 

*Contribution to the 4th International Biometric Conference, Ottawa, 1958. Publication was
supported in part by a grant from the United States National Science Foundation.

Death rate. The fraction dying (or surviving) during the preimaginal
development has often been found to be linearly related to
the initial number of larvae:

$$x_i/x_0 = a - b x_0 \qquad (1)$$

where $x_0$ is the initial number of larvae, $x_i$ the number surviving to the
imaginal stage, and $a$ and $b$ are positive constants.

In an earlier review (Andersen [1957]) this was found to be the case 
with some Diptera and Cladocera. If the animals die as a result of 
mutual interference, it seems reasonable that the probability of their 
death (= the mean fraction dying) is proportional to the total 
number, but the logical background of the relation is much more complicated
(cf. loc. cit.).

At very small densities the mortality is independent of the density, 
and also for very high densities the graphs often show a tendency to 
bend towards parallelism with the X-axis. Thus empiric graphs are 
often slightly sigmoid. In some moths such as Tineola (Titschack 
[1937]) and Endrosis (Andersen [1956]) this tendency is more pro- 
nounced, and it is not clear whether it is inherent in the animals or 
whether it is due to variation in the breeding medium and/or in the 
age of the animals used for the experiments (cf. below). 

Differential mortality. If a population contains two genotypes, 
these may show differential mortality. A model of the effect of density 
upon differential mortality may be built on the following three assump- 
tions: 

(a) The surviving fraction of the total population is linearly related
to the density according to equation (1) above:

$$x_i/x_0 = a - b x_0 .$$

(b) The ratio of the surviving number ($v$) of one genotype to the
surviving total is also linearly related to the initial total, thus:

$$v/x_i = g - h x_0 \quad (g \text{ and } h \text{ are positive constants}).$$

(c) The initial ratio between the two genotypes is constant, thus:

$$v_0/x_0 = p$$

($v_0$ being the initial number of one genotype, and $p$ a constant).

On these assumptions the surviving fractions of the two genotypes
will be equal to second power polynomials in the initial total:

$$v/v_0 = (a - b x_0)(g - h x_0)/p = l - m x_0 + n x_0^2 ,$$

where $l$, $m$, and $n$ are positive constants.
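The polynomial follows by multiplying the three assumptions together, since $v/v_0 = (v/x_i)(x_i/x_0)(x_0/v_0)$. The sketch below, in Python, expands the product and also evaluates the surviving fraction of the second genotype, taken here as $(x_i - v)/(x_0 - v_0)$; the function names and all parameter values are hypothetical.

```python
def surviving_fraction_first(x0, a, b, g, h, p):
    """v/v0 for the genotype described by assumptions (a) to (c)."""
    return (a - b * x0) * (g - h * x0) / p


def surviving_fraction_second(x0, a, b, g, h, p):
    """(x_i - v)/(x0 - v0) for the other genotype (derived, not in the text)."""
    return (a - b * x0) * (1.0 - g + h * x0) / (1.0 - p)


def polynomial_coefficients(a, b, g, h, p):
    """Coefficients l, m, n in v/v0 = l - m*x0 + n*x0**2."""
    return a * g / p, (a * h + b * g) / p, b * h / p


# Hypothetical constants, with equal initial numbers of the two genotypes.
params = dict(a=0.9, b=0.001, g=0.5, h=0.0005, p=0.5)
x0 = 100.0

l, m, n = polynomial_coefficients(**params)
first = surviving_fraction_first(x0, **params)
second = surviving_fraction_second(x0, **params)
total = params["a"] - params["b"] * x0  # equation (1)

print(first, l - m * x0 + n * x0 ** 2)
print(params["p"] * first + (1.0 - params["p"]) * second, total)
```

The weighted mean of the two genotype fractions returns the total surviving fraction, so the two parabolas lie on opposite sides of the straight line of equation (1).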
Graphically, with the initial total as abscissa, the surviving fraction
of the two genotypes will be described by parabolas, one on each side
of, and concave towards, the straight line describing the total surviving
fraction. They intersect at two points on this line, viz. at zero survival
and at very low density.

This model fits the result of the experiment of Bøggild and Keiding
[1958, Fig. 4] with equal numbers of a DDT-resistant and a DDT- 
susceptible strain of the house fly (Musca domestica). 

In Titschack’s experiment with Tineola mentioned below in con- 
nection with the sex ratio, assumptions (b) and (c) hold, but because of 
unfavorable spacing of the densities and a rather sigmoid graph for the 
total mortality, it is not fit for testing this model. 

Larval cannibalism. In Chrysomyia albiceps the larvae are cannibalistic,
and this species does not follow equation (1) (cf. Ullyett [1950],
Fig. 17). However, the logarithm of the surviving fraction is linearly
related to the initial number of larvae:

$$\log (x_i/x_0) = a - b x_0$$

[symbols defined for equation (1)].

This leads to the interesting consequence that the probability of a
certain larva dying in the interval $dt$ at the time $t$ is not proportional
to the surviving number of larvae at the time $t$ minus the one in question
($x_t - 1$), but possibly to the initial number of larvae ($x_0$). This is
shown by the following reasoning:

If we assume (a) that if two larvae meet, there is a constant probability
($m$) that one of them dies, and (b) that the probability of one
larva meeting one of the other larvae is proportional to their number,
or equal to $n(x - 1)$, where $x$ is the number and $n$ is a constant, then the
probability of a certain larva dying during the interval $dt$ at $t$ is

$$k(x_t - 1)\,dt ,$$

where $k = mn$.
On this assumption the probability that a certain larva is alive at
the time $t + dt$ is

$$P(t + dt) = P(t)[1 - k(x_t - 1)\,dt],$$

so that

$$P(t) = \exp\left[-\int_0^t k(x_t - 1)\,dt\right],$$

where $P(t)$ is the probability that the larva in question is alive at the
time $t$.
However, the mean number ($\bar x_i$) surviving to the imaginal stage is:

$$\bar x_i = x_0 P(T) = x_0 \exp\left[-\int_{t=0}^{T} k(x_t - 1)\,dt\right]$$

and, if the duration of the development ($T$) is independent of $x_0$, this
leads to the equation:

$$1/\bar x_i = p/x_0 + q \quad (p \text{ and } q \text{ being constants}),$$

which is clearly inconsistent with Ullyett's observations.
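The reciprocal relation can be checked numerically. The sketch below, in Python, assumes that the mean number of survivors may be treated deterministically, so that $dx/dt = -kx(x - 1)$, whose solution gives $p = e^{-kT}$ and $q = 1 - e^{-kT}$; the step-by-step integration, the function names and the values of $k$ and $T$ are hypothetical illustrations.

```python
import math


def survivors_closed_form(x0, k, T):
    """Solution of dx/dt = -k*x*(x - 1) at time T, i.e. 1/x = p/x0 + q."""
    p = math.exp(-k * T)
    q = 1.0 - p
    return 1.0 / (p / x0 + q)


def survivors_stepwise(x0, k, T, steps=100000):
    """Euler integration: each larva dies with probability k*(x - 1)*dt."""
    x = float(x0)
    dt = T / steps
    for _ in range(steps):
        x -= k * x * (x - 1.0) * dt
    return x


k, T = 0.01, 1.0
results = []
for x0 in (10, 50, 200):
    closed = survivors_closed_form(x0, k, T)
    stepped = survivors_stepwise(x0, k, T)
    results.append((x0, closed, stepped))
    print(x0, closed, stepped, 1.0 / closed)
```

Plotting $1/\bar x_i$ against $1/x_0$ for such values gives a straight line, whereas the observations require $\log (x_i/x_0)$ to be linear in $x_0$.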

The errors in our assumptions (a) and (b) above are obviously that
$m$ and $n$ are not constant, because as the larvae grow they move faster
and are perhaps more dangerous to one another.

If, however, we assume that $m$ and $n$ vary in such a way that the
probability of a certain larva dying during the interval $dt$ is

$$(a + b x_0)\,dt,$$

then we get

$$P(T) = \exp[-(a + b x_0)T]$$

and

$$\bar x_i/x_0 = \exp[-(a + b x_0)T],$$

which fits Ullyett's observations on the realistic assumption that the
duration of the development ($T$) is constant.
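That this second assumption gives a logarithm of the surviving fraction linear in the initial number may be seen from a short computation. The sketch below is in Python; the function name and the values of $a$, $b$ and $T$ are hypothetical.

```python
import math


def surviving_fraction(x0, a, b, T):
    """Mean surviving fraction when the death rate per larva is (a + b*x0)."""
    return math.exp(-(a + b * x0) * T)


a, b, T = 0.05, 0.002, 10.0
densities = (50, 100, 150, 200)
log_fractions = [math.log(surviving_fraction(x0, a, b, T)) for x0 in densities]

# Equal steps in x0 give equal steps in the logarithm of the surviving fraction.
steps = [later - earlier
         for earlier, later in zip(log_fractions, log_fractions[1:])]
print(log_fractions)
print(steps)
```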

It must be noted that although this model gives a satisfactory
description of the variation of $x_i$ with varying $x_0$, it does not necessarily
describe the variation of $x_t$ with varying time (from egg to pupa or
adult). This variation remains to be described by future investigations.

Fecundity. The number of eggs per female is often linearly related 
to the reciprocal of the number of animals per unit of food, which is 
in turn proportional to the amount of food or the number of oviposition 
sites per animal (Andersen [1957]): 


$$y = a + c/x_0 , \qquad (2)$$

where $y$ is the number of eggs per female, $a$ and $c$ are positive constants,
and $x_0$ may have two different meanings: if the equation describes the
interaction between immature stages, $x_0$ is the initial number of larvae
per unit of food, but if it describes the interaction between ovipositing
females, $x_0$ is the number of adults per unit of food.

When the interaction occurs only between immature stages, the 
number of eggs per female is often proportional to the female weight, 
and in such cases equation (2) may be used for a description of the
effect of the density upon the female (or pupal) weight. 

In some grain and seed pests (Andersen [1957], p. 18) the equation is 
complicated by a linear term accounting for the effect of mutual inter- 
ference with oviposition: 


y = a - bx + c/x


where x is the number of animals per unit medium, b is a positive 
constant, and the other symbols are as defined in equation (2). 
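
One possible way of estimating the constants a, b and c from observed
fecundities is ordinary least squares on the terms 1, x and 1/x. The
sketch below (Python, standard library only) illustrates that assumption
and is not necessarily the procedure used in the work cited; the function
name fit_fecundity and the data are hypothetical, the latter being
generated without error from assumed constants.

```python
def fit_fecundity(xs, ys):
    """Least-squares estimates (a, b, c) for y = a - b*x + c/x."""
    # Design columns 1, -x, 1/x, so that the coefficients are a, b, c directly.
    rows = [(1.0, -x, 1.0 / x) for x in xs]
    # Normal equations (X'X) beta = X'y as an augmented 3 x 4 matrix.
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

# Hypothetical densities, with fecundities generated from a = 30, b = 0.5, c = 40.
xs = [1, 2, 4, 8, 16]
ys = [30 - 0.5 * x + 40 / x for x in xs]

a, b, c = fit_fecundity(xs, ys)
print(round(a, 3), round(b, 3), round(c, 3))
```

With real observations the fitted constants would carry sampling error,
and the requirement that b and c be positive should be checked rather
than assumed.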

Sex ratio. In animals with chromosomal and random determination
of the sex and heterogametic males, the sex ratio is not affected by the
density or the amount of food. Even much cited examples such as
that of Tribolium confusum (Holdaway and Smith [1933]) break down
under statistical examination.

In the clothes moth, Tineola biselliella, Titschack’s [1937 and 
1922] observations show a linear relation between the proportion of 
females and the initial number of larvae per unit of food. This effect, 
which is obviously due to differential mortality, has not been significantly 
demonstrated in other Lepidoptera although odd sex ratios are common 
in this order of insects. 

In some parasitic Hymenoptera sex determination is chromosomal, 
but non-random by way of arrhenotokous or amphitokous partheno-
genesis. In Trichogramma the proportion of females decreases approxi-
mately linearly with the increasing number of females (Salt [1936]),
whereas in Alysia it increases linearly with the volume of the host 
puparia (Holdaway and Smith [1932]). In the first case the sex ratio 
is a function of the density of the parasite, but in the second case it is a 
function of the density of the host. 

When the sex determination is phenotypical the proportion of 
females may be linearly related to the density as found by Banta and 
Brown [1929] for Moina (Cladocera), but in other cases, viz. the plant 
parasitic nematode Heterodera (Ellenby [1954]) and the animal parasitic 
nematode Mermis (Christie [1929]), linearity could not be tested because
of the large variance of the observations.

A detailed critical review of the literature on the effect of density 
on the sex ratio will be published elsewhere. 

Rate of development. In some animals (e.g. Musca, cf. Bøggild
and Keiding [1958]) the rate of development is independent of the 
density. In others the rate decreases with increasing density (e.g. 
Endrosis, cf. Andersen [1956]). 

In order to build a mathematical model describing the dependence 
of the rate of development on the density, it is necessary to consider 
the type of distribution (log-normal, reciprocal-normal, or otherwise)
of the duration of development, because it will be necessary to work 
with some sort of mean duration. It is my intention to return to this 
subject using unpublished observations on Endrosis as well as making 
a critical review of the literature. 

Movements. It is known that in some cases movements may be 
dependent on density. Adults and larvae may move away from points 
of high density, and egg-laying females of parasites and grain pests 
may move away from points at which hosts or kernels already harbour 
eggs or larvae. Although no review has been published, it can safely 
be stated that experiments are much needed on these phenomena. 

Beneficial effect of density. In most animals one or more of the 
foregoing regulations are at work above a certain minimal density. 
However, in some Lepidoptera (e.g. Plusia gamma and Pieris brassicae) 
rather intensive crowding is beneficial, as the duration of development
and the mortality decrease with increasing density (Long [1953]).
This becomes comprehensible when noting that these Lepidoptera are
migratory in England, where the experiments were carried out (Williams
et al. [1942]); regulation may therefore take place during migration
or in the winter quarters.

An analogous phenomenon may possibly be found in sea birds 
breeding in colonies. 

Interaction within one age group in nature. In most of the experiments 
cited above the populations consisted of individuals of one age only. 
At first sight this may seem a case rarely found in nature, but with 
closer consideration it is found to be common. Many insects emerge 
and lay their eggs almost on the same day, so that the next generation 
consists of larvae of practically equal age. Also in many populations 
of fish nearly all eggs are spawned during a short period, and in several 
species of birds all individuals start ecologically equal each spring 
with the exception of just the very young and very old ones, and this 
is also the case with many other perennial animals. 

On the other hand, all age groups are as a rule interacting in animals 
with many litters a year (e.g. small rodents), and in insects with several 
generations during the summer. 

Future investigations. Experiments designed to describe the inter- 
action between individuals in populations consisting of a single age 
group can be expected to fit the models given above only if the medium 
is strictly homogeneous, and cultures are so small that gradients of 
temperature, humidity, etc. are not formed, and only if the animals
entering the experiments are exactly equal in age, genotype, and
previous conditions.

It will therefore be very important to refine the experiments on
these points and observe whether deviations from the models (e.g. 
the sigmoid flexure) diminish by so doing, and also to complicate 
the experiments on the same points to show which deviations are 
produced. 

Another important field of future investigations concerns the dis- 
tribution in time of the effect of density within a single age group. 
Experiments should be started with newly hatched larvae, as were 
most of those cited above, but should be interrupted after varying 
periods, and from then onwards the larvae should be allowed to complete 
their development singly and with excess of food. Likewise experiments 
should be started with larvae of a single age group reared singly with 
excess of food, and then crowded at varying ages in order to see what 
effect density has from this age onwards. Such experiments will illumi- 
nate many problems, including the very important one concerning the 
delayed action of crowding (time lag). They will also form the basis 
of experiments with interaction between age groups, and not until, in 
a distant future, these have been carried out, will it be possible to 
deduce the sigmoid growth curve resulting from intraspecies interaction. 

Summary. Based on the rather voluminous literature on competition 
in populations consisting of a single age group, the following math- 
ematical models and descriptions are given: 

(1) When larval cannibalism is not involved, the fraction surviving 
to the imaginal stage is often linearly related to the initial number of 
larvae. 


(2) In the case of larval cannibalism the logarithm of the surviving 
fraction is linearly related to the initial number of larvae. 


(3) In the case of differential mortality of two genotypes composing
a population, the mathematical model is graphed as a straight line for
the total surviving fraction, and two parabolas concave towards it
for the surviving fraction of the two genotypes, the abscissa being the
initial number of larvae. Also the ratio between the surviving number
of one genotype and the surviving total forms a straight line against
this abscissa.


(4) In the case of competition between larvae, the number of eggs 
per female (and the female weight) is linearly related to the reciprocal 
of the initial number of larvae, and 

(5) in the case of competition between adults the fecundity is often 
linearly related to the reciprocal of the number of adults, but 

(6) in some grain and seed pests the equation is complicated by the 
introduction of a linear term for mutual interference. 

(7) In some rare cases the sex ratio is influenced by the density. It
is linearly related to the density 


(a) in Tineola with chromosomal and random determination of 
the sex, 


(b) in a hymenopterous parasite with chromosomal, but non- 
random determination of the sex, 

(c) in Moina (Cladocera) with phenotypical determination of the
sex.


Critical reviews have not yet been made of the literature dealing 
with the effect of density on the duration of development and on animal 
movements. 


A hypothetical explanation is given of the beneficial effect of larval 
crowding in some Lepidoptera. 

Populations consisting of a single age group are thought to be 
common in nature, but so are those composed of many age groups, and 
future investigations must deal with competition in such populations 
as well as the distribution in time of the effect of density in populations 
consisting of a single age group. 



A SYNTHESIS OF MULTIVARIATE TECHNIQUES TO
DISTINGUISH PATTERNS OF GROWTH IN
GRASSHOPPERS¹,²


R. E. BLACKITH


Imperial College Field Station 
Sunninghill, Berks., England 


INTRODUCTION 


The discriminant function which best separates the means of two 
groups of well-defined and objectively distinguishable organisms is, 
in effect, a vector which expresses the contrast between the patterns 
of growth of the organisms: when the two groups are polymorphic 
forms of a species, the generalised distance associated with the dis- 
criminant function is a measure of the polymorphism. This generalised 
distance represents the efficacy of the discriminant function in separating 
the two groups in a space of as many dimensions as there are characters 
measured (Rao [1952]). Furthermore, when the characters which 
enter into the discriminant function are the dimensions of parts of 
the body or of its appendages, the vector then represents a change of 
shape, adequately represented in proportion to the extent to which 
the chosen suite of characters covers the main parts of the organism. 
A description, with worked examples, of the way in which such suites 
of characters may be combined into a single measure of differences of 
shape is given by Rao [1952]. 

When more than two groups are compared in this way, the dis- 
criminant functions which link them to one another may differ in 
direction as well as in length, and the generalised distance between the 
positions of the means of each group can be used as a measure of the 
degree of likeness of one shape to another, whereas the direction of 
the discriminant functions reveals the qualitative distinction between 
dissimilar changes of shape. In this way each vector may readily be 
given its appropriate biological identification, although it may not be 
so easy to determine the number of different kinds of polymorphism 


1 Expanded version of a paper read to the IVth International Biometric Conference, Ottawa,
Sept. 1958.


2 Publication was supported in part by a grant from the United States National Science Foundation. 
which happen to have been elicited in a group of organisms by this
means. These techniques of morphological integration have a special 
interest for quantitative taxonomy and evolutionary studies, many 
of the implications being discussed by Olson and Miller [1958]. 

The discriminant function is by no means the only way in which 
the characters can be compounded. Whereas the discriminants maxi- 
mise the distance between the members of a pair of groups of organisms, 
there is the wider problem of maximising the dispersion of some larger 
number of groups in a space, or rather hyperspace, extending over 
several dimensions of variation. Rao [1952] describes the construction 
of the orthogonal axes which delineate this multidimensional space, 
axes which are often called canonical variates. Once these axes have 
been computed, the mean position of each group can be located in 
relation to this framework of reference. Even though the canonical 
variates may not bear so readily identifiable an interpretation as do 
discriminant functions, a test for the number of significant dimensions 
of variation is then available. 

The relationship which an analysis along canonical axes bears to 
one employing discriminant functions has much in common with that 
which the analysis of a supposedly homogeneous group of organisms 
along the latent vectors of its dispersion matrix bears to certain types 
of factor analysis. This is so at least in the sense that latent vectors 
and canonical variates preserve orthogonal axes, whereas discriminant 
functions and factors retain a closer contact with the patterns of growth 
involved, by virtue of the association between the angle included by 
two such vectors and the degree to which the patterns of growth which 
they represent are interlocked. Olson and Miller suggest that multi- 
variate techniques obscure the existence of sets of closely correlated 
characters constituting a pattern of growth as the organism develops; 
but this is true only if the analyst is so preoccupied with the require- 
ments of significance testing that attention is focussed on the dispersion 
of the groups to the exclusion of a proper study of their mutual orien-
tation in the hyperspace (cf. Olson and Miller [1958], Blackith [1957]).

There is in the extensive literature of factor analysis the suggestion 
that a deliberate selection of extreme organisms, deviating from the 
normal in some defined fashion, could be used to orient the relevant 
factor along an identifiable dimension of variation. Thurstone [1947] 
describes in his book the theory of a form of factor analysis and speaks 
in his introduction of the value of ‘freak’ individuals for this purpose, 
but the next step of linking factors to discriminant functions seems not 
to have been taken. There is, in practice, a continuous gradation, 
from the application of discriminatory analysis to distinct, polymorphic, 
groups of organisms at the one extreme, to the extraction of latent
vectors from dispersion (covariance) or correlation matrices derived 
from supposedly homogeneous groups of organisms: for there will be 
many unrecognised polymorphs in such a group, and the real problem 
is the degree of relevance, rather than the existence, of such genotypic 
or phenotypic diversity. The case of a homogeneous group may also 
be treated by extracting factors from its correlation matrix, using 
communalities in the leading diagonal of the matrix. These factors 
may then be rotated if by so doing the interpretation in biological 
terms is facilitated. 

The multiplicity of polymorphic variation in the grasshoppers and 
locusts lends itself particularly to the illustration of these general 
comments. To the construction of generalised distance charts (Rao
[1952], Blackith [1957], Hughes and Lindley [1954]) and analyses along
canonical variates (Rao [1952], Blackith and Albrecht [1959]) we add
the computation of the mutual orientation of the several kinds of
vectors. Between any two vectors α and β there is an angle θ given by
the expression (Thrall and Tornheim [1957]):


cos θ = (α, β) / (|α| · |β|)


where (α, β) is the inner product of the two vectors, and the denominator
consists of the product of their lengths, in each instance the square
root of the sum of the squares of the coefficients (loadings, in the case
of factors). The length of a vector is quite distinct from the general-
ised distance by which that vector is able to separate two groups of
organisms: in the ordinary practice of factor analysis the factors are
of unit length.

Vectors which are almost orthogonal may convincingly be held to 
represent independent patterns of growth; those which are nearly 
parallel may be based on a common pattern of growth, or they may 
represent the operation of distinct physiological stimuli having the 
same morphometric consequences. They are then said to be symbatic 
in the sense of Blackith and Albrecht [1959]. It will in general be hard 
to distinguish correlated vectors which are interlocked at the physio- 
logical level, perhaps for genetic considerations, from correlated vectors 
representing physiologically independent patterns of growth which 
happen both to have been elicited by some external stimulus triggering, 
perhaps, a hormonal control to which both respond. 


EXPERIMENTAL MATERIAL 


The stable colour patterns of several species of grasshopper are 
associated with differences of shape which, in relation to samples of 
some 100 individuals, can often be distinguished by significant differ-
ences of single characters, and well appreciated by a suitably chosen
suite of 10 characters. In this work 1,052 grasshoppers of the genus 
Chorthippus, and 375 of the genera Omocestus and Stenobothrus were 
collected from the grounds of the Field Station during a single season. 
The ten characters, chosen so as to cover, as far as was practicable, the 
major areas of the body, were then measured. These characters are 
set out in Table 1, together with the latent roots and vectors of the 
dispersion matrix for Omocestus and Stenobothrus, and the canonical 
roots and vectors illustrating the dispersion of the different species and 
polymorphs of Chorthippus. The dispersion matrix for the first two 
genera is a pooled covariance matrix based on the eight groups (two 
species, two sexes, and two colour forms) whose interrelationships have 
already been examined in terms of a generalised distance chart by 
Blackith and Roberts [1958]. These authors also published the dis-
persion matrix.

Of the ten characters the only one that calls for special comment
is the ‘reduced weight’. This measure of general size comprises the
dry weight of the head and thorax and their appendages, with the
abdomen cut away. By this device the vagaries of the development
of the reproductive organs are prevented from obscuring differences
of body-size.


TABLE 1
SIGNIFICANT LATENT AND CANONICAL ROOTS AND ASSOCIATED VECTORS

                               Significant latent roots,        Significant canonical
                               Omocestus and Stenobothrus¹      roots, Chorthippus²

Root                           16.0867    0.5157    0.3206      67.7668    24.5145

Character                                        Vector elements

Number of antennal segments     0.0523    1.0000   -0.1027      -0.0028    -0.0075
Width of head (mm.)             0.0215    0.0141    0.0155      -0.6185     1.0000
Pronotal width (mm.)            0.0197    0.0146    0.0098       1.0000    -0.0713
Hind femoral length (mm.)       0.0929    0.0928    0.2688      -0.0139     0.0676
Hind femoral width (mm.)        0.0233    0.0024    0.0008       0.0008    -0.0950
Prozonal length (mm.)           0.0110    0.0055   -0.0095      -0.0572     0.0616
Metazonal length (mm.)          0.0150    0.0160    0.0555      -0.0177     0.1996
Front femoral width (mm.)       0.0046   -0.0025    0.0068      -0.2992    -0.4445
Elytron length (mm.)            0.0847    0.0694    1.0000      -0.0094    -0.1020
Reduced weight (mg.)            1.0000   -0.0678   -0.1056      -0.0000    -0.0092

1 The first three (of seven) significant latent roots of the dispersion matrix for the genera Omocestus
and Stenobothrus.
2 The two significant canonical roots for the analysis of dispersion of the genus Chorthippus.


THE LATENT ROOTS AND VECTORS OF 
THE DISPERSION MATRIX 


Blackith and Roberts [1958] have already shown that the genera 
Omocestus and Stenobothrus exhibit not more than two basically distinct 
patterns of growth; these were the sexual dimorphism and the symbatic 
differences of form between the species, and between the two colour 
varieties within the species. These general conclusions are reinforced 
by the representation of these various groups on the canonical variate 
chart of Figure 1. 

The extraction of the latent roots and vectors of the pooled dis- 
persion matrix, on the other hand, indicated that far more of the 
potential patterns of growth existed in these insects, but remained 
latent in the sense that they orient the residual variation of the sup- 
posedly homogeneous groups along preferred dimensions of variation. 
Of the ten roots, seven are significant in that they can be distinguished 
from an isotropic residue by Lawley’s [1956] test. In a less relevant 
sense, all the roots are significantly greater than zero (Kendall [1957]). 
It so happens that seven factors, extracted in an orthodox factor 
analysis, would be sufficient to account for the associations represented 
by the covariance matrix (Thurstone [1947]). The appearance of as
many factors in the 10 × 10 matrix analysed here suggests that with
such large samples one can detect many of the latent patterns of growth 
exhibited by the insects even when only a few of these patterns have 
been elicited sufficiently to produce actual polymorphism in the pheno- 
types. 

The first three latent vectors of the matrix are dominated by indi- 
vidual characters. So close are they to being unit vectors that the 
first of them makes an angle of only 8°10’ with the unit vector 
(0, 0, 0, 0, 0, 0, 0, 0, 0, 1). The second latent vector and the unit 
vector (1, 0, 0, 0, 0, 0, 0, 0, 0, 0) contain an angle of no more than 
7°46’, whereas the third latent vector makes an angle of 17°20’ with 
(0, 0, 0, 0, 0, 0, 0, 0, 1, 0). Thus these three latent vectors are repre- 
sentative of three independent patterns of growth each primarily 
responsible for one of the three characters: reduced weight; number of 
antennal segments; and elytron length. Between them, these patterns 
of growth account for some 99% of the total variation, but there is a 
distinction to be drawn between the numerical importance of a growth 
pattern, particularly in this context, and its biological interest. One 
does not need to perform elaborate analyses to discover that grass-
hoppers vary greatly in size; what is important is that this size variation
should be adequately segregated so that other and more interesting 
sources of variation may properly be assessed. The analogy with 
fertility gradients in agricultural experiments is apparent. 
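
The angles quoted above can be checked approximately from the rounded
coefficients printed in Table 1. The following Python sketch applies the
expression cos θ = (α, β)/(|α| · |β|) to the three latent vectors and to
the unit vectors of their dominant characters; the function names are
illustrative only. With the printed coefficients the results agree with
the quoted angles to within about a tenth of a degree.

```python
import math

def angle_degrees(u, v):
    """Angle between u and v from cos(theta) = (u, v) / (|u| |v|)."""
    inner = sum(a * b for a, b in zip(u, v))
    length_u = math.sqrt(sum(a * a for a in u))
    length_v = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(inner / (length_u * length_v)))

def unit_vector(position, size=10):
    """Vector with 1 at the given character position and 0 elsewhere."""
    return [1.0 if i == position else 0.0 for i in range(size)]

# First three latent vectors as printed in Table 1 (characters in table order).
latent_vectors = [
    [0.0523, 0.0215, 0.0197, 0.0929, 0.0233, 0.0110, 0.0150, 0.0046, 0.0847, 1.0000],
    [1.0000, 0.0141, 0.0146, 0.0928, 0.0024, 0.0055, 0.0160, -0.0025, 0.0694, -0.0678],
    [-0.1027, 0.0155, 0.0098, 0.2688, 0.0008, -0.0095, 0.0555, 0.0068, 1.0000, -0.1056],
]
# Zero-based positions of reduced weight, antennal segments and elytron length.
dominant_character = [9, 0, 8]

angles = [
    angle_degrees(vector, unit_vector(position))
    for vector, position in zip(latent_vectors, dominant_character)
]
for number, angle in enumerate(angles, start=1):
    print(f"latent vector {number}: {angle:.2f} degrees")
```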


IDENTIFICATION OF LATENT VECTORS AND FACTORS BY 
MEANS OF DISCRIMINANT FUNCTIONS 


The latent vectors of the dispersion matrix have properties similar 
to those of the reference vectors of Thurstone [1947]. In this instance, 
each of the orthogonal vectors reflects the discriminatory capacity of 
an isolated character, the remaining characters having negligible load- 
ings on the vector. Nevertheless, the latent vectors calculated here 
depend to some extent on the unequal variances of the characters in 
the dispersion matrix, whereas the factor analyst generally prefers to 
standardise his variances by extracting the factors from a correlation 
matrix. 

To satisfy these requirements to some extent, the first three factors
have been extracted from the correlation matrix formed from the
original dispersion matrix, by the centroid technique. The loadings
on these factors are shown in Table 2. The most general change is a
rotation of the factors relative to the framework of reference afforded
by the latent vectors. There is, in addition, some loss of orthogonality
attributable to the fact that the second factor is more closely tied to
the second latent vector than are the other factors to their latent
vectors; the angle between this second pair is 28°, whereas the first
and third pairs of vectors include angles of some 60° each.


TABLE 2
APPROXIMATE LOADINGS IN FIRST THREE FACTORS OF THE
Omocestus-Stenobothrus CORRELATION MATRIX

Character                      First Factor   Second Factor   Third Factor

Number of antennal segments        0.42           0.86            0.08
Width of head                      0.86          -0.01           -0.27
Pronotal width                     0.79          -0.09           -0.09
Hind femoral length                0.81          -0.06           -0.21
Hind femoral width                 0.68          -0.01            0.14
Prozonal length                    0.57          -0.08           -0.67
Metazonal length                   0.61          -0.09            0.08
Front femoral width                0.56          -0.16            0.21
Elytron length                     0.58          -0.26            0.39
Reduced weight                     0.87          -0.10            0.08

The assessment of as many as seven patterns of growth by com- 
parison of the latent vectors with the appropriate discriminant func- 
tions presents severe practical difficulties because no one species of 
grasshopper exhibits all the types of identifiable polymorphism to 
which the Orthoptera Saltatoria as a whole are subject. For instance, 
non-swarming grasshoppers such as those examined in this paper are 
only rarely sufficiently abundant to show phase polymorphism even in 
Siberia; in England such population densities may never occur. These 
species have been reared in captivity only with severe larval mortalities. 
By contrast, the swarming locusts do not show the clear-cut colour 
polymorphism characteristic of the species discussed here. 

Nevertheless, useful contrasts between the vectors can be made.
The discriminant function describing sexual dimorphism is essentially
the same for many different species and even genera (cf. Table 3). Thus
the vector separating males and females of the ‘dorsal stripe’ variety
of Chorthippus parallelus (Zett.) and the corresponding vector dis-
tinguishing the sexes of the ‘brown’ variety of C. brunneus (Thunb.)
include an angle of no more than 17° despite the fact that in C.
brunneus both sexes are macropterous, whereas in C. parallelus the
males are brachypterous and the females micropterous. Sexual di-
morphism appears to reflect not only some general differences of size
but also of elytron length, head-width, and that width of the front
femur which Michel Verdier (private communication) has found to
be a secondary sexual character in other Orthoptera.


TABLE 3
COMPARISON OF DISCRIMINANT FUNCTIONS TO DISTINGUISH INDEPENDENT
OR SYMBATIC PATTERNS OF GROWTH

                                            Discriminants contrasting

                               Sexes of         Sexes of         Colour varieties   Green males of
                               Chorthippus      Chorthippus      of Chorthippus     C. parallelus
Character                      parallelus       brunneus         parallelus         and O. viridulus
                               (variety         (variety         (females)
                               dorsal stripe)   brown)

Number of antennal segments       1.78            -3.10             -0.36              -4.27
Width of head                    89.85            65.31            -15.55              11.74
Pronotal width                   -6.66             2.12             23.51               5.12
Hind femoral length               7.74             7.50             -0.72              -2.38
Hind femoral width               -5.64            10.98             -3.95               6.46
Prozonal length                   6.41           -12.13             -2.75             -45.97
Metazonal length                 14.25            14.90             -8.11              44.14
Front femoral width             -46.81           -42.50              9.19              -4.17
Elytron length                   -9.15             4.17              0.44               9.44
Reduced weight                   -3.48            -4.88             -0.37              -2.30

The colour polymorphs differ somewhat in size, but the associated 
distinctions of shape are different from those elicited during the develop- 
ment of sexual dimorphism. As an example, the variety ‘green with 
brown sides’ of C. parallelus is significantly larger than any of the other
four varieties (mean reduced weight of 100 females 22.48 mg.), whereas
the variety ‘dorsal stripe’ is significantly lighter than any other (mean
reduced weight of 100 females 19.85 mg.). Yet the discriminant func-
tion which distinguishes these varieties makes an angle of over 50°
with that, mentioned above, between the sexes of the dorsal stripe 
variety. One feature of obvious importance is the shape of the pro- 
notum; the brown variety, penultimate in size, has a pronotum signifi- 
cantly longer in both prozona and metazona than any other variety. 
The dorsal stripe variety, the smallest of all, has the second longest 
pronotal components. The general structure of the polymorphism is 
ascertainable from the angles made by the various discriminant func- 
tions. The vector distinguishing female C. parallelus of the varieties 
‘green with brown sides’ and ‘dorsal stripe’ is almost at right angles 
(80°) to that separating ‘green’ males of the two species C. parallelus 
and Omocestus viridulus (L.). In this respect speciation within the 
genus Chorthippus appears to have exploited different paths from 
those utilised in the differentiation of Omocestus and Stenobothrus for 
which the discriminants separating the colour varieties are symbatic 
with those separating the species. This diversity of evolutionary 
pathways is examined more closely below, in the canonical analysis. 
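
The near-orthogonality just described can be examined with the
coefficients printed in Table 3. The Python sketch below computes the
acute angle between the discriminant contrasting the colour varieties of
C. parallelus and that contrasting the green males of the two species.
It assumes that the sign of a discriminant function is arbitrary, so
that the absolute value of the cosine is taken; the function name is
illustrative only. With the coefficients as printed the angle comes out
close to the 80° quoted.

```python
import math

def acute_angle_degrees(u, v):
    """Acute angle between two vectors, ignoring the arbitrary sign of each."""
    inner = sum(a * b for a, b in zip(u, v))
    lengths = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(abs(inner) / lengths))

# Coefficients as printed in Table 3, characters in the order of the table.
colour_varieties = [-0.36, -15.55, 23.51, -0.72, -3.95, -2.75, -8.11, 9.19, 0.44, -0.37]
green_males = [-4.27, 11.74, 5.12, -2.38, 6.46, -45.97, 44.14, -4.17, 9.44, -2.30]

angle = acute_angle_degrees(colour_varieties, green_males)
print(f"{angle:.1f} degrees")
```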

The various discriminant functions form a network in the 10- 
dimensional hyperspace delineated by the latent vectors as axes. In 
such a network the groups of organisms are separated by the appropriate 
generalised distances in a number of dimensions effectively smaller 
than the dimensionality of the hyperspace (Blackith [1957]). In these 
charts the mutual orientation of the groups is first established, and the 
underlying dimensions of variation put in by inspection. When 
canonical vectors or latent vectors are used, the axes are computed
first, and then used as orthogonal frameworks of reference within which
the groups can be located. Some forms of factor analysis may be
considered to fall between these two extremes. 


ANALYSES ALONG CANONICAL VARIATES 


Figure 1 shows the ensemble of grasshoppers located in this way 
along the canonical variates generated by the only two significant 
roots of the determinantal equation: 


|B - λA| = 0


where A is the pooled dispersion matrix for the genus Chorthippus 
and B is the corresponding matrix representing the dispersion of the 
different groups within the genus (Table 1). The remaining two genera 
were added to the chart when it became clear that the two significant 
dimensions of variation in Chorthippus, which account for some 98.6% 
of the total variation within this genus, are symbatic with those already 
established for Omocestus and Stenobothrus. 
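
For two characters the determinantal equation |B - λA| = 0 reduces to a
quadratic in λ, which makes the computation easy to follow. The Python
sketch below solves that quadratic for a pair of hypothetical 2 × 2
matrices (they are not data from this paper, and the function name is
illustrative); for the ten characters measured here a general routine
for the symmetric generalised eigenvalue problem would be needed instead.

```python
import math

def canonical_roots_2x2(B, A):
    """Roots of |B - lambda*A| = 0 for symmetric 2 x 2 matrices, largest first."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # Expanding the determinant gives q2*lambda^2 - q1*lambda + q0 = 0.
    q2 = a11 * a22 - a12 * a21
    q1 = a11 * b22 + a22 * b11 - a12 * b21 - a21 * b12
    q0 = b11 * b22 - b12 * b21
    disc = math.sqrt(q1 * q1 - 4 * q2 * q0)
    return (q1 + disc) / (2 * q2), (q1 - disc) / (2 * q2)

# Hypothetical pooled within-group (A) and between-group (B) dispersion matrices.
A = [[2.0, 0.5], [0.5, 1.0]]
B = [[8.0, 2.0], [2.0, 3.0]]

roots = canonical_roots_2x2(B, A)
print([round(root, 4) for root in roots])
```

Each root measures the dispersion of the group means along the
corresponding canonical variate relative to the variation within groups.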

The concordance of the directions of the vectors representing 
sexual dimorphism in these five species is noteworthy, as is that of 
the colour variation. The chart confirms that speciation in Chorthippus 
has been accompanied by changes of shape distinct from those ex- 
ploited by the putative common ancestor of Omocestus and Stenobothrus. 
In Chorthippus the colour polymorphism and the specific differences 
of shape are virtually at right angles to one another; whereas in the 
other two genera they are parallel. 

There is little consistency in the relative positions of the colour 
varieties of the different species on the chart. This observation con- 
firms and extends that of Blackith and Roberts [1958] that one cannot 
predict that varieties of different grasshopper species sharing a common 
colour pattern will also share other phenotypic attributes, as Rubtzov’s 
[1935] interpretation of Vavilov’s rule would suggest. 


DISCUSSION 


The numerical description of the shape and size characteristic of 
definable groups of organisms has been developed over many years 
by Cousin [1956] and the factorial analysis of such ‘types structuraux’
has been treated by Teissier [1955]. The present work has been in many 
ways an application of the techniques of multivariate analyses to the 
ideas which these and other authors have sought to consolidate. 

One way of making multivariate methods more generally conformable 
with the taxonomists’ requirements is to assess large numbers of 
characters on each organism (say 100 characters) and to reduce the 
then prohibitive labour of measurement by qualitative appreciation 
of the different characters, as suggested by Michener and Sokal [1957]
and by Sneath [1957]. The present investigation does not go so far 
as these suggestions, preferring to retain a closer contact with the 
changes of shape during speciation. 

The latent vectors of a dispersion matrix, which represent in this
instance the underlying modes of growth of the organism, and the
factors, on the one hand, and the canonical variates and the dis-
criminant functions, which contrast the differential operation of such
modes of growth as have actually been realised in the phenotype, on
the other hand, each bring their own contribution to the study of the
patterns of growth. No doubt the topic would be further illuminated
if each stage in the growth of the organism could be adequately measured, 
instead of the adult stage alone. Kermack [1954] has shown that one 
cannot rely on the practice of taking sections at different stages in the 
allometric growth of organisms to provide indications of the changes 
of shape accompanying speciation, and more general evolutionary 
processes. Such processes are associated with changes of shape which 
are distinct from those of normal growth, at least in the echinoid 
Micraster. 

In the grasshoppers, despite the interlocking of growth in the 
elytra, and the segmentation of the antennae, with the size of the 
insect, these three characters are essentially independent in the sense that 
in individuals variations of the three take place independently, as the 
latent vectors demonstrate. But when different species are concerned, 
there is a general association between the three characters, as Mason 
[1954] showed in an extensive survey made less useful by inappropriate 
statistical processing. However, Kevan [1957] has noted that in the 
genus Chrotogonus the elytra may be anything from one-tenth to twice 
the length of the hind femur, used as a rough measure of the size of 
the organism. 

One might hazard a guess that it is on the degree of interlocking of 
these otherwise independent patterns of growth, rather than on the 
individual characters, that natural selection operates. A criticism 
levelled at any concept of evolution which demands the roughly simul- 
taneous modification of innumerable individual characters is the low 
probability that all such changes could be accomplished by haphazard 
variation within the requisite time. Such a criticism is of greatly 
reduced force when the characters are associated in a relatively few 
patterns of growth. In locusts and grasshoppers, the number of funda- 
mentally distinct patterns of growth has been shown to be small com- 
pared with the number of characters measured. Recently, the same 
state of affairs has been demonstrated in social wasps (Blackith [1958]) 
and in the Mirid bug Plagiognathus (Southwood and Blackith,
unpublished) and is probably general. 

Wigglesworth has stressed that apparently qualitative differences 
of form in insects, showing as clearcut polymorphism, are in fact only 
quantitatively distinct (Wigglesworth [1954]). To reduce the plethora 
of individual characters to a few patterns of growth would be a simpli- 
fication in the same vein, if the polymorphism were to be considered 
in terms of the suppression or enhancement of the degree of interlocking 
of patterns of growth. Taking the evidence marshalled by Olson and 
Miller [1958] with that presented here, there seems to be considerable 
justification for such a view of morphological integration. 

The demonstration of the existence of a relatively small number of 
independent patterns of growth in insects, albeit contrary to the 
experience of Sewall Wright [1954], goes a long way towards justifying 
reservations about the empirical usage of very large numbers of 
qualitatively appraised characters: it is to this extent in line with Olson and 
Miller’s [1958] emphasis on the importance of suites of closely correlated 
characters among the collected attributes of an organism. Diagrams 
of taxonomic relationships, as for instance the ‘trees’ obtained by 
Michener and Sokal [1957] by means of their weighted variable group 
method, imply that evolution proceeds in one general direction, and 
measure the extent to which progress has been made along this 
generalized evolutionary pathway by a particular taxonomic entity. The 
burden of the present paper is that more than one direction is involved 
even at the infraspecific level, and that it is at least as useful to know 
where the process is leading as to know how far it has proceeded. 


A SIGNIFICANCE TEST FOR THE SEPARATION OF TWO 
HIGHLY MULTIVARIATE SMALL SAMPLES 


A. P. DEMPSTER 


Harvard University 
Cambridge, Massachusetts, U. S. A. 


Introduction 


A statistical technique which has been mathematically discussed 
in [3] is here given an explanatory derivation with formulas and illus- 
tration by application to a particular example whose basic data may be 
found in [1]. In this example an experiment is performed on 12 human 
males from 16 to 39 years old showing no alcoholic tendencies. On 
each individual 62 biochemical items were measured consisting of 8, 
30, and 16 items from analyses of blood serum, urine, and saliva re- 
spectively, 5 taste thresholds and 3 phagocytic indices. The question 
is roughly: Is there evidence that the 62 items could be used to dis- 
tinguish between alcoholic and non-alcoholic? This data has been 
much analyzed and has motivated a number of new techniques in- 
cluding the present one. For further references to these analyses see 
Chung and Fraser [2]. These authors present significance tests based 
on the randomization argument of Pitman [6]; their methods will be 
compared with the present method at the end of this paper. 

In general a type of individual or object is contemplated on each 
example of which a large number k of different characteristics may be 
measured, and it is supposed that 2 groups of such individuals can be 
distinguished by means apart from the k measured items. Suppose a 
small sample to be available from each group so that the basic data 
consists of k items measured on each member of 2 samples. Now many 
questions may be asked about the kind and degree of the relationship 
between the k measured variables on one hand and on the other hand 
the two-valued variable which assigns each individual to his proper 
group. Our concern is with a situation in which the samples do not 
show a clear and meaningful relation between the group of an individual 
and his measurements on a single item or small set of items but where 
it seems reasonable that all k characteristics might be used to define 
a relationship of sufficient strength to produce statistical significance 
on a test with fairly small samples. This may be described in statistical 
terms as a multivariate 2-sample problem where the aim is a significance 
test to distinguish between the populations sampled, and it should not 
be confused with the apparently more difficult problem of patterning 
where a single sample is presented as an unknown mixture of 2 groups 
and the objective is to find a grouping which predominates in some 
sense. 

The test to be proposed here is based on similar theory and directed 
at the same kind of difference as the usual 2-sample $t$-test or its classical 
multivariate generalization using essentially Hotelling’s $T^2$. The 2 
groups are populations whose means are 2 points in $k$-space and whose 
scatter about these means is largely described by the within-population 
variances and covariances of the $k$ measures. The question is whether 
the population means are sufficiently separated relative to the scatter 
about the means to be shown significant from samples of sizes $n_1$ and $n_2$. 
The theory assumes homogeneity of variances and covariances within 
populations and also multivariate normal distributions, but in practice 
the method may be expected to share certain robustness qualities with 
analysis of variance techniques. The present method is a substitute 
for $T^2$ made necessary because $T^2$ is undefined for $k > n_1 + n_2 - 2$ 
and in any case requires inversion of a matrix of order $k$ which is 
impractical for large $k$. In avoiding these difficulties of $T^2$ we find it 
necessary to give up the desirable affineness property of $T^2$ whereby 
the same $T^2$ results from any $k$ linear combinations of the $k$ variables 
used in place of the $k$ given variables. This necessitates more care in 
the choice of variables to be used as input and leads to a discussion 
of general multivariate transformations. 


Derivation of the Test 


The input for the analysis is taken to be $k$ items, typically shrewdly 
chosen functions of measured variables, available on each of $n$ 
individuals. This data may be represented by an $n \times k$ matrix 
$X = (x_{rj})$ where $x_{rj}$ is the value of item $j$ for individual $r$. The $r$th 
row of $X$ may be denoted by the vector $X_r$ and is the set of item values 
for individual $r$. An individual will be thought of as a single observation 
drawn randomly from a multivariate population and we will denote 
the first- and second-order moments of such a multivariate population 
by 

$$\mathrm{ave}\{X_r\} = M_r \quad\text{and}\quad \mathrm{var}\{X_r\} = L_r,$$

where $M_r$ is a $1 \times k$ matrix $(m_{r1}, m_{r2}, \ldots, m_{rk})$ and $L_r$ is a $k \times k$ 
matrix $(l_{rij})$. This notation means 

$$\mathrm{ave}\{x_{rj}\} = m_{rj}, \quad \mathrm{var}\{x_{rj}\} = l_{rjj} \quad\text{and}\quad \mathrm{cov}\{x_{ri}, x_{rj}\} = l_{rij}.$$

In our case we assume the first $n_1$ individuals are a sample from one 
population and the next $n_2$ are a sample from the second population 
where $n = n_1 + n_2$, and so we may write 

$$\mathrm{ave}\{X_r\} = M' \quad\text{for } 1 \le r \le n_1$$

and 

$$\mathrm{ave}\{X_r\} = M'' \quad\text{for } n_1 + 1 \le r \le n.$$

Also we assume both populations to have the same variances and 
covariances among items, and so we may write 

$$\mathrm{var}\{X_r\} = L \quad\text{for } 1 \le r \le n.$$

Here $M'$, $M''$ and $L$ are unknown matrices and our concern is to test 
the hypothesis that $M' - M''$ is the zero vector. 

Now each of the $n$ individuals corresponds to a single degree of 
freedom (d.f.) and, as often done in univariate work, a coordinate 
change can be made to yield $n$ new orthogonal single d.f., one 
corresponding to the overall mean, one to the difference between sample 
means, and the remainder to within-sample variation. This change 
of coordinates amounts to finding $Y = AX$ where $A$ is an $n \times n$ orthogonal 
matrix and the rows of the $n \times k$ matrix $Y$, namely $Y_1, Y_2, \ldots, Y_n$, 
correspond to the $n$ new orthogonal d.f. The first row of $A$ produces 
the first new d.f. corresponding to the grand mean and therefore must 
be 

$$\left(\frac{1}{\sqrt{n}}, \frac{1}{\sqrt{n}}, \ldots, \frac{1}{\sqrt{n}}\right)$$

and the second row corresponding to differences of sample means 
must be 

$$\left(\frac{1}{n_1}, \ldots, \frac{1}{n_1}\ [n_1 \text{ terms}],\ -\frac{1}{n_2}, \ldots, -\frac{1}{n_2}\ [n_2 \text{ terms}]\right) \Big/ \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}.$$

The remaining rows are arbitrary except that they must satisfy the 
conditions for orthogonality of $A$. 

The new orthogonal vectors $Y_1, \ldots, Y_n$ have the first and second 
order moments 


$$\mathrm{ave}\{Y_1\} = (n_1 M' + n_2 M'')/\sqrt{n},$$

$$\mathrm{ave}\{Y_2\} = (M' - M'') \Big/ \sqrt{\frac{1}{n_1} + \frac{1}{n_2}},$$

$$\mathrm{ave}\{Y_r\} = 0 \quad\text{for } 3 \le r \le n,$$

and 

$$\mathrm{var}\{Y_r\} = L \quad\text{for } 1 \le r \le n.$$

And from these formulas a method of detecting non-zero $M' - M''$ 
appears naturally, for, except for the shift in mean due to non-zero 
$M' - M''$, $Y_2$ has the same mean and variance as each of $Y_3, \ldots, Y_n$. 
Thus we might expect non-zero $M' - M''$ to show up through $Y_2$ being 
longer than $Y_3, \ldots, Y_n$ in some average sense. Accordingly we propose 
to base a significance test on 

$$F = Q_2 \big/ [(Q_3 + \cdots + Q_n)/(n - 2)]$$

where $Q_i$ is the squared length of $Y_i$, i.e. $Q_i = Y_i Y_i'$. By introducing 
the notion of length into the definition of the test we also introduce 
a type of non-uniqueness under linear transformation of the $k$ variables, 
but we postpone a discussion of this point. 
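
As an illustration only, the statistic can be computed without ever writing
down the arbitrary rows of the orthogonal matrix. The sketch below assumes the
standard fact that an orthogonal change of coordinates preserves the total sum
of squares, so that the within-sample squared lengths add up to the total sum
of squares of the data minus the first two squared lengths. The function name
`dempster_f` and its arguments are hypothetical and do not come from the paper.

```python
import math

def dempster_f(sample1, sample2):
    """Return F = Q_2 / [(Q_3 + ... + Q_n)/(n - 2)] for two lists of k-item rows."""
    n1, n2 = len(sample1), len(sample2)
    n = n1 + n2
    rows = sample1 + sample2
    k = len(rows[0])

    # Y_1: every individual weighted by 1/sqrt(n) (grand-mean direction).
    y1 = [sum(row[j] for row in rows) / math.sqrt(n) for j in range(k)]

    # Y_2: difference of sample means divided by sqrt(1/n1 + 1/n2).
    scale = math.sqrt(1.0 / n1 + 1.0 / n2)
    y2 = [
        (sum(row[j] for row in sample1) / n1
         - sum(row[j] for row in sample2) / n2) / scale
        for j in range(k)
    ]

    q1 = sum(v * v for v in y1)
    q2 = sum(v * v for v in y2)

    # Assumption used here: since A is orthogonal, Q_1 + ... + Q_n equals
    # the sum of squares of all entries of X.
    total = sum(v * v for row in rows for v in row)
    within = total - q1 - q2          # Q_3 + ... + Q_n
    return q2 / (within / (n - 2))
```

For example, `dempster_f([[1.0, 0.0], [3.0, 0.0]], [[0.0, 1.0], [0.0, 3.0]])`
compares two samples of two individuals measured on two items.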

For definite distribution theory we assume the $X_r$ to be samples from 
multivariate normal distributions determined by the means and 
variances as described. Then $Y_1, \ldots, Y_n$ are also independent and 
normally distributed with means and variances as described, so that, 
under the null hypothesis $M' = M''$, $Q_2, \ldots, Q_n$ are independently 
distributed as a positive quadratic form in normal variables, which 
distribution depends on all the parameters in $L$. Fortunately, it is 
generally a good approximation to use a $\chi^2$-shaped distribution for $Q$, 
i.e., write $Q \sim m\chi^2_r$ meaning $Q$ is approximately distributed as $m$ times 
a $\chi^2$ random variable on $r$ d.f. This results in 

$$F \sim F_{r,(n-2)r}$$

where, under the null hypothesis, $F_{r,(n-2)r}$ denotes an F-type random 
variable on $r$ and $(n - 2)r$ d.f. 

In this way the dependency on unknown parameters of the 
distribution of $F$ is reduced from $L$ to the single parameter $r$. It is known 
that $r \le k$, and $r$ may be thought of as a reduced dimensionality from 
an ideal dimensionality $k$ which would hold if $L$ were a unit matrix. 
However, $r$ is unknown and so an exact significance test cannot be 
based on $F$. We avoid this difficulty by using an estimate $\hat r$ of $r$ 
and testing $F$ as though it were $F_{\hat r,(n-2)\hat r}$. This inevitably results in 
distortion of significance levels but it is shown in [3] that this distortion 
is slight. 

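Two methods of estimating $r$ are as follows. The first uses only 
$Q_3, Q_4, \ldots, Q_n$. Supposing these $Q$-values to be a sample from $m\chi^2_r$, 
there exist [3] sufficient statistics for $m$ and $r$, and one of these depends 
only on $r$, viz. 

$$t = (n - 2)\ln\left[\frac{1}{n-2}\sum_{i=3}^{n} Q_i\right] - \sum_{i=3}^{n} \ln Q_i.$$

It can be shown that [3] 

$$t \sim \left(\frac{1}{r} + \frac{1}{3r^2}\right)\chi^2_{n-3}$$

is a good approximation even for small $r$ so that a good estimator $\hat r_1$ 
can be defined from 

$$t = \left(\frac{1}{\hat r_1} + \frac{1}{3\hat r_1^2}\right)(n - 3).$$

A more precise estimator of $r$ can be constructed by making use of the 
angles among vectors $Y_3, \ldots, Y_n$. If $\theta$ is the angle between 2 such 
vectors, it can be shown [3] that, analogous to the method above, 

$$-\ln\sin^2\theta \sim \left(\frac{1}{r} + \frac{3}{2r^2}\right)\chi^2_1$$

and that the $\binom{n-2}{2}$ angles are approximately pairwise independent. 
Thus if $-u$ is the sum of the natural logs of the squared sines of these 
angles, 

$$u \sim \left(\frac{1}{r} + \frac{3}{2r^2}\right)\chi^2_{\binom{n-2}{2}}.$$

Thus a second estimator $\hat r_2$ may be defined from 

$$t + u = \left(\frac{1}{\hat r_2} + \frac{1}{3\hat r_2^2}\right)(n - 3) + \left(\frac{1}{\hat r_2} + \frac{3}{2\hat r_2^2}\right)\binom{n-2}{2}.$$
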
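
The first estimator can be sketched in a few lines, under two assumptions that
are made here and are not quoted from the paper: that the statistic is
$t = N \ln(\bar Q) - \sum \ln Q_i$ over the $N = n - 2$ within-sample
$Q$-values, and that the estimator solves $t = (1/\hat r + 1/(3\hat r^2))(N - 1)$.
With $y = 1/\hat r$ this is the quadratic $y^2 + 3y - 3t/(N - 1) = 0$, whose
positive root is taken. The names `t_statistic` and `estimate_r1` are
hypothetical.

```python
import math

def t_statistic(qs):
    """t = N*ln(mean of Q) - sum of ln Q over the N within-sample Q-values."""
    n_q = len(qs)
    return n_q * math.log(sum(qs) / n_q) - sum(math.log(q) for q in qs)

def estimate_r1(qs):
    """Solve t = (1/r + 1/(3 r^2)) * (N - 1) for r (assumed defining relation)."""
    n_q = len(qs)
    t = t_statistic(qs)
    if t <= 0.0:
        # Identical Q-values give t = 0: no finite dimensionality is indicated.
        return math.inf
    c = t / (n_q - 1)
    # With y = 1/r the relation reads y + y^2/3 = c; take the positive root.
    y = (-3.0 + math.sqrt(9.0 + 12.0 * c)) / 2.0
    return 1.0 / y
```

Widely scattered $Q$-values give a large $t$ and hence a small estimate, which
matches the interpretation of $r$ as an effective dimensionality.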
This completes a description of how to use the method but several 
properties which might be considered disadvantageous remain to be 
discussed. The first property concerns the use of the notions of length 
and angle in k-dimensional space, i.e., the non-affineness property. If 
in place of the k given variables k linear combinations of the given k 
are used as input for the technique, then vectors will have different 
lengths and angles and different significance levels will result. In this 
sense the test is non-unique. This does not affect the validity of the 
test as long as the choice of input variables is fixed apart from the 
data. In fact the choice of variables is a means of feeding in prior 
information or hunches about the variables and should be used by an 
experienced investigator to increase the sensitivity of the test. 
However, if the non-uniqueness is disturbing to the reader, it might help to 
point out that the test based on Hotelling’s $T^2$ only gives a unique 
result provided the transformations allowed are restricted to linear 
transformations. In general one could define $k'$ ($k' < k$ or $k' > k$) 
functions of the $k$ given measured variables and use these $k'$ variables 
as input. Under such general multivariate transformations, neither 
$T^2$ nor the present technique would produce a unique result, nor is it 
possible that any useful technique could provide a unique result. 

A second property of the test which may seem a practical 
disadvantage is that the last $(n - 2)$ rows of matrix $A$ were partly arbitrary. 
It can be easily shown that this arbitrariness does not affect at all the 
value of $F$, but only $\hat r_1$ and $\hat r_2$ which are of secondary importance. The 
extent of the differences of significance level resulting from different 
choices of $A$ is investigated in the following example by repeating 
the calculations for 5 choices. If $A_1, A_2, \ldots, A_n$ are the rows of $A$, 
then $A_1$ and $A_2$ are determined and one method of assigning values 
to the remainder is to choose them as unit vectors spun randomly 
uniformly with regard to direction in $n$-space subject to the 
perpendicularity constraints. To realize such vectors we may begin with 
$(n - 2)$ $1 \times n$ vectors $C_3, C_4, \ldots, C_n$ whose entries are $n(n - 2)$ 
random normal deviates. Then find 

$$B_3 = C_3 - (A_1 C_3')A_1 - (A_2 C_3')A_2$$

and 

$$A_3 = a_3 B_3 \quad\text{where } a_3 = (B_3 B_3')^{-1/2}.$$

Here $B_3$ is the component of $C_3$ perpendicular to $A_1$ and $A_2$ and $a_3$ is a 
constant chosen to reduce $B_3$ to unit length. Next find 

$$B_4 = C_4 - (A_1 C_4')A_1 - (A_2 C_4')A_2 - (A_3 C_4')A_3$$

and 

$$A_4 = a_4 B_4 \quad\text{where } a_4 = (B_4 B_4')^{-1/2}$$

and so on to 

$$B_n = C_n - (A_1 C_n')A_1 - \cdots - (A_{n-1} C_n')A_{n-1}$$

and 

$$A_n = a_n B_n \quad\text{where } a_n = (B_n B_n')^{-1/2}.$$
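
The random completion of the orthogonal matrix described above can be sketched
as follows. This is a minimal illustration in plain Python, not the author's
IBM 650 program; the function name `complete_orthogonal_matrix`, the `seed`
argument, and the sample sizes in the usage note are hypothetical. Each new
vector of normal deviates has its components along all earlier rows removed
and is then scaled to unit length.

```python
import math
import random

def complete_orthogonal_matrix(n1, n2, seed=0):
    """Return an n x n orthogonal matrix (list of rows) with the two fixed
    first rows and randomly oriented remaining rows."""
    rng = random.Random(seed)
    n = n1 + n2

    # Row 1: grand-mean direction.
    a1 = [1.0 / math.sqrt(n)] * n
    # Row 2: difference-of-sample-means direction, scaled to unit length.
    scale = math.sqrt(1.0 / n1 + 1.0 / n2)
    a2 = [1.0 / (n1 * scale)] * n1 + [-1.0 / (n2 * scale)] * n2

    rows = [a1, a2]
    while len(rows) < n:
        c = [rng.gauss(0.0, 1.0) for _ in range(n)]   # random normal deviates
        b = list(c)
        for a in rows:
            proj = sum(ai * ci for ai, ci in zip(a, c))        # A_i C'
            b = [bi - proj * ai for bi, ai in zip(b, a)]       # remove component
        norm = math.sqrt(sum(bi * bi for bi in b))
        rows.append([bi / norm for bi in b])                   # unit length
    return rows
```

Calling `complete_orthogonal_matrix(4, 8, seed=1)` gives one random choice of a
12 by 12 matrix; changing `seed` gives another, which is how repeated choices
could be compared.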


Example 


This example using the data in [1] is presented to illustrate the 
computations and does not pretend to represent expert use of the data 
from the transformation point of view since the variables were little 
altered from their raw form except that logs were taken of the taste 
threshold variables and all the variables were scaled very roughly to 
have similar spread. 

The calculations were performed on an IBM 650 programmed by 
the author, with no previous computer experience, using the interpretive 
system [4]. The first step was to read in the $12 \times 62$ matrix $X$ and 
reduce immediately to $W = XX'$ on which further computations were 
performed. The last 10 rows of $A$ were determined using the method 
of random choice just described. The entries of $C_3, \ldots, C_{12}$ were 120 
random normal deviates produced internally by the machine using the 
subroutine [5]. From $A$ and $W$ the $Q_i$ were computed for $2 \le i \le 12$ 
from the formula 

$$Q_i = A_i W A_i'$$

and from the $Q_i$ previously given formulas were used to find $F$ and 
$\hat r_1$. In order to find $\hat r_2$, it was necessary to compute quantities $Q_{ij}$ 
for $3 \le i < j \le 12$ from the formula 

$$Q_{ij} = A_i W A_j'$$

and from these the squared sine of the angle between $Y_i$ and $Y_j$ is given 
by 

$$\sin^2\theta = 1 - \frac{Q_{ij}^2}{Q_i Q_j}.$$

From these squared sines $u$ and $\hat r_2$ are found directly. The random 
choice of $A$ was made a total of 5 times to check empirically the 
variations in $\hat r_1$ and $\hat r_2$ which might be expected. A whole single stage program 
consisting of finding $W$, $A$, $F$, $t$ and $u$ required about 45 minutes running 
time. 

The results are as follows: 

$F = 1.382$ 

    $\hat r_1$    $\hat r_2$
      28.5          28.6
      29.1          29.0
      27.7          28.0
      27.1          28.1
      18.9          31.4

Each estimate $\hat r$ results in a significance level of $F$ determined from the 
percentage points of the distribution of $F_{\hat r,10\hat r}$ corresponding to 1.382. 
These may be roughly interpolated in percents from the commonly 
available tables of $F$ at 5%, 10% and 25% as: 

    using $\hat r_1$    using $\hat r_2$
         9.8                 9.8
         9.6                 9.6
        10.2                10.0
        10.4                10.0
        14.4                 8.2


The conclusions are that $F$ is significant at around the 10% point. 
The $r$-values of around 28 may be interpreted by saying that the actual 
dimensionality of 62 was reduced to an effective dimensionality of 
around 28. Thus there is some indication, not very positive, that $Q_2$ 
is larger than it should be under the null hypothesis. The result is 
tantalizing because it leaves unanswered the question of whether 
some shrewder use of composite variables as input might have led to 
more definite results, either by increasing $F$ or by increasing $r$ to nearer 
its limit. Increasing $r$ means choosing variables as input which are 
more nearly uncorrelated than those used. Increasing $F$ means choosing 
variables which discriminate better. One must of course resist the 
temptation to choose transformed variables on the basis of the observed 
data. The variation in significance levels due to different $\hat r_2$ estimates 
does not appear to be of practical importance, but that due to different 
$\hat r_1$ might possibly be considered important. The last of the 5 choices 
of $A$ appears to produce outliers of a sort in $\hat r_1$ and $\hat r_2$, but the 
computations were checked and seem correct. 


A Randomization Test 


An alternative method of attaching a significance level to $F$ is the 
randomization test similar to that proposed by Pitman [6] for the 
univariate 2-sample $t$-test. Suppose $F$ were computed for each of 
$\binom{n}{n_1}$ ways of dividing the $n$ individuals into samples of size $n_1$ and $n_2$. 
Suppose these $F$ values were ranked and the $F$ corresponding to the 
true division into samples had rank $S$. Then, under the null hypothesis 
that the 2 populations have identical distributions, $S$ would be 
equiprobably distributed over the integers 1 to $\binom{n}{n_1}$. Thus the formula 

$$\mathrm{Prob}(S \le s) = s \Big/ \binom{n}{n_1}$$

gives a means of attaching a significance level to $F$. It turns out to be 
equivalent to rank the lengths of the vectors joining sample means, 
and this simplifies calculations. 

For our example the calculations proved feasible on the IBM 650 
and the resulting value of $S$ produced after about 1 hour was 438. This 
corresponds to a significance level of 

$$100\left[1 - \frac{438}{495}\right] = 11.5\%.$$

The close agreement with the previous method provides encouragement 
that the non-exact test based on normal distribution theory provides 
robust theory for attaching significance levels. 
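
A complete randomization test of this kind can be sketched as below, ranking
the squared length of the vector joining the two sample means over every
division of the individuals. This is an illustrative sketch with hypothetical
names (`randomization_level`, `rows`, `n1`). It counts the observed division
and any ties as at least as extreme, which is a common convention and may
differ slightly from the convention used in the paper.

```python
from itertools import combinations

def randomization_level(rows, n1):
    """rows: list of k-item lists, the first n1 forming sample 1.
    Returns the fraction of divisions whose separation of sample means
    is at least as large as that of the true division."""
    n = len(rows)
    k = len(rows[0])
    totals = [sum(row[j] for row in rows) for j in range(k)]

    def separation(index_set):
        # Squared length of the vector joining the two sample means.
        m = len(index_set)
        s1 = [sum(rows[i][j] for i in index_set) for j in range(k)]
        return sum(
            (s1[j] / m - (totals[j] - s1[j]) / (n - m)) ** 2 for j in range(k)
        )

    observed = separation(tuple(range(n1)))
    values = [separation(idx) for idx in combinations(range(n), n1)]
    # Small tolerance so the observed division always counts itself.
    at_least = sum(1 for v in values if v >= observed - 1e-12)
    return at_least / len(values)
```

The number of divisions grows quickly with the sample sizes, so for larger
samples one would have to sample divisions at random rather than enumerate
them all.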


Comparison with Chung and Fraser 


The tests proposed by Chung and Fraser [2] yield slightly lower 
significance levels, 8.3% and 7%, when applied to the same data. If 
in fact the two populations are different, this suggests that the Chung 
and Fraser tests are more sensitive than those proposed here. On the 
other hand we must note that we do not know that the populations 
differ and, if the populations are the same, a comparison of these par- 
ticular significance levels gives no information on the relative sensitivity 
of the tests. No comparison of power functions has been attempted. 

The methods in [2] base the test on criteria based on ranks and 
implicitly resolve the problem of transforming the data. The author 
wishes to leave the choice of variables more up to the data analyzer 
and believes that in principle this choice is a means of feeding in a 
priori knowledge which could result in a much more sensitive test. 


Conclusions 


A parametric technique is presented which is based on statistics 
with natural geometrical interpretations. The type of non-exact sig- 
nificance test proposed is somewhat unusual but its results check quite 
well with results achieved by using permutation theory on the same 
test criterion. Both the normal theory test and the permutation theory 
test are feasible with an electronic computer, but as sample sizes in- 
creased the complete permutation test would tend more quickly to 
run beyond the bounds of computer feasibility. 

The author acknowledges helpful criticism from H. Fairfield Smith. 


AN ECOLOGICAL DISTRIBUTION AKIN TO FISHER’S 
LOGARITHMIC DISTRIBUTION 


J. H. Darwin 


New Zealand Department of Scientific and Industrial Research, 
Wellington, New Zealand 


1. Introduction and Summary 


In several publications (e.g. [1947] [1950]) C. B. Williams has used 
the properties of Fisher’s logarithmic distribution (see Fisher, Corbet 
and Williams, [1943]) to characterise the abundance of species of 
animals or plants in a sampling area. The distribution has been used 
to describe data consisting of the number of species of which one or 
more members have been “caught” in a sampling area or time. The 
logarithmic or log distribution implies that the number of species of 
which $n$ members have been caught is proportional to $x^n/n$ where $x$ is 
a positive constant less than one. The coefficient of proportionality 
is usually written $\alpha$ and called the index of diversity. It is a measure 
of the richness of the biological association in the area and it is 
independent of the nature of the sampling. The constant $x$ however 
depends on such things as the time spent in sampling and the “volume” 
of the sample (e.g. the size of a quadrat or the size of a light trap). 

It is sometimes impossible or inconvenient for all the individuals 
of a sample to be counted. Often it is only practicable to observe 
whether or not a species has occurred at all. Two examples of such 
data have come the author’s way. In the first example the data were 
the different numbers of moss species that had been found in one, two, 
three, four, five, or all of the principal islands of Hawaii. These data 
were kindly supplied by Professor A. R. Gemmell of the University 
College of North Staffordshire, England, in 1949. The moss species 
were of three reproductive kinds, dioecious, monoecious, and sterile. 
Some of the dioecious species were found in only a few of the islands, 
some in nearly all the islands, while a smaller number were found in 
about half the islands. The monoecious species were mostly found 
in only a few islands although there were a few found in nearly all. 
The sterile species were the least widely spread of all, few of them 
being found in more than one island. In fact the decline in the number 
of sterile species found in $n$ islands as $n$ increased followed the typical pattern 
of a log distribution up to the natural cut-off at $n = 6$. 

In the following paragraphs we discuss a simple distribution which 
might be useful to describe data of this moss species kind, and we 
apply it to the moss data and also to recently collected data on some 
New Zealand ciliate protozoa. 


2. Analysis 


The chance $p$ of a species occurring in a sample varies between 0 
and 1. If $M$ samples are taken, the probability that a species is found 
in $n$ of them is the binomial probability 

$$\frac{M!\,p^n(1 - p)^{M-n}}{(M - n)!\,n!}. \qquad (1)$$

Species vary in frequency so that $p$ may be expected to vary. Perhaps 
the simplest flexible form of variation that $p$ can take is that described 
by the beta density function 

$$\frac{(A + B - 1)!\,p^{A-1}(1 - p)^{B-1}}{(A - 1)!\,(B - 1)!}. \qquad (2)$$

This density function has a turning point at $(A - 1)/(A - 1 + B - 1)$, 
a point which may not fall in the range (0, 1). If both $A$ and $B$ are 
greater than 1, the density (2) is zero at $p = 0$ and $p = 1$ and there is a 
peak that is nearer $p = 0$ or $p = 1$ according as $A$ or $B$ is the smaller 
constant. If $A$ and $B$ both become large, the density degenerates in 
form being increasingly concentrated around the peak. If $A$ tends to 1, 
the value of $p$ at which the peak occurs tends to 0; if $B$ tends to 1, the 
value of $p$ at which the peak occurs tends to 1. If $A$ is less than 1, the 
density (2) rises to infinity as $p$ tends to 0, and, if $B$ is less than 1, 
the density rises to infinity as $p$ tends to 1; thus a U-shape, an L-shape 
and a reverse L-shape are possible. 

Because of this wide range of shapes that the density (2) can take, 
we may reasonably expect that the natural variation of $p$ amongst 
the different species will often be closely described by a function of 
the family (2). An average of (1) with respect to (2) produces as the 
probability that a species occurs in $n$ out of $M$ samples, 

$$p_n = \frac{M!}{n!\,(M-n)!}\cdot\frac{A(A+1)\cdots(A+n-1)\;B(B+1)\cdots(B+M-n-1)}{(A+B)(A+B+1)\cdots(A+B+M-1)}, \qquad n = 0, 1, \ldots, M. \qquad (3)$$

This compound binomial probability has been discussed, usually for 
data of a different kind, by several people (for instance Polya [1930] 
and Skellam [1948]). Skellam in particular discusses finding the 
constants $A$ and $B$ by moments. 
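
The compound binomial probabilities can be evaluated directly from rising
products. The sketch below assumes the usual beta-binomial form obtained by
averaging the binomial probability over the beta density; the function name
`compound_binomial` is hypothetical.

```python
from math import comb

def compound_binomial(M, A, B):
    """Probabilities p_0, ..., p_M that a species occurs in n of M samples
    when p follows a beta density with constants A and B."""
    # Denominator (A+B)(A+B+1)...(A+B+M-1) is the same for every n.
    den = 1.0
    for i in range(M):
        den *= A + B + i

    probs = []
    for n in range(M + 1):
        num = 1.0
        for i in range(n):            # A(A+1)...(A+n-1)
            num *= A + i
        for i in range(M - n):        # B(B+1)...(B+M-n-1)
            num *= B + i
        probs.append(comb(M, n) * num / den)
    return probs
```

With `A = B = 1` the beta density is uniform and every `p_n` equals
`1/(M + 1)`, which gives a quick check of the formula.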

Suppose now that there is a number $Z$ of species potentially present 
in the sample and that the number of species appearing in $n$ of the 
$M$ samples is $x_n$, with $\sum x_n = S$ and $\sum n x_n = N$. The likelihood of 
this is 

$$\frac{Z!}{(Z - S)!\,\prod_{n=1}^{M} x_n!}\; p_0^{Z-S} \prod_{n=1}^{M} p_n^{x_n}. \qquad (4)$$

Now suppose $Z$ tends to infinity, $A$ tends to zero and $ZA$ is a constant 
$k$. Then (4) becomes 

$$\exp\left\{-k\left[\frac{1}{B} + \frac{1}{B+1} + \cdots + \frac{1}{B+M-1}\right]\right\} \prod_{n=1}^{M} \frac{1}{x_n!}\left[\frac{kM(M-1)\cdots(M-n+1)}{n(M+B-1)\cdots(M+B-n)}\right]^{x_n}. \qquad (5)$$

We may also for its own sake derive the probability of an observed 
species appearing in $n$ out of $M$ samples. This probability is 

$$\lim\,[p_n/(1 - p_0)] = q_n, \text{ say,} \qquad (6)$$

$$q_n = \frac{M(M-1)\cdots(M-n+1)}{n(M+B-1)\cdots(M+B-n)} \Big/ \left[\frac{1}{B} + \frac{1}{B+1} + \cdots + \frac{1}{B+M-1}\right]. \qquad (7)$$

For comparison with Fisher’s log distribution we also write for the 
expected number $u_n$ of species appearing in $n$ out of $M$ samples 

$$u_n = \frac{kM(M-1)\cdots(M-n+1)}{n(M+B-1)\cdots(M+B-n)}, \qquad n = 1, 2, \ldots, M. \qquad (8)$$

We give equations for the estimation of the constants $k$ and $B$ in §3 
and describe applications in §4. The distributions (7) and (8) will, 
when matched to data, give the same expected values $u_n$ for the same 
value of $B$. It is convenient then to refer to either of them as the 
B-distribution. Later, in §5 we shall discuss the distinction between 
them in terms of the situations in which they are used. 

In calculating terms of (7) or (8) it is easiest once the constants 
have been found to compute the term for $n = 1$ and multiply this 
term by the appropriate number of quantities 

$$u_{n+1}/u_n = n(M - n)/[(n + 1)(M + B - n - 1)].$$


3. Estimation 


It would be desirable to find maximum likelihood estimates of $k$ 
and $B$ using the likelihood (5) but the calculations are too laborious 
to carry through for such estimates to be generally used. Instead we 
employ moment equations relating the observed number of species 
$S$, ($S = \sum x_n$), and the observed number of appearances $N$, ($N = \sum n x_n$), 
to their expected values. The equations are 

$$S = k\left[\frac{1}{B} + \frac{1}{B+1} + \cdots + \frac{1}{B+M-1}\right] \qquad (10)$$

since 

$$E\left(\sum x_n\right) = \lim_{A,\,1/Z \to 0} Z\,E_p[1 - (1 - p)^M],$$

and 

$$N = kM/B \qquad (11)$$

since 

$$E\left(\sum n x_n\right) = \lim\,[E_p(MpZ)].$$

Then 

$$N/MS = \frac{1}{B} \Big/ \left[\frac{1}{B} + \frac{1}{B+1} + \cdots + \frac{1}{B+M-1}\right]. \qquad (12)$$

The quantity $N/MS$ has a physical meaning. It is the average 
proportion of samples in which an observed species appeared. It therefore 
lies between 0 and 1. To facilitate the solution of (12) we give in Table 
I the value of $B$ as a function of $N/MS$ and $M$. We suggest using 
linear interpolation in Table I. Such interpolation will give a value 
of $B$ that is slightly too high and the set of numbers of species appearing 
in $n$ samples may not sum to the required value $S$. However the 
discrepancy is not likely to be important. 

TABLE I 
Values of B for Values of M and N/MS 

                                 M
N/MS       5      6      7      8      9     10     11     12
 .70    .238   .213   .195   .182   .172   .165   .158   .153
 .65    .311   .276   .252   .235   .222   .211   .203   .196
 .60    .404   .355   .323   .300   .282   .268   .257   .247
 .55    .526   .457   .413   .381   .358   .339   .324   .312
 .50    .691   .592   .530   .487   .455   .430   .410   .393
 .45    .927   .779   .689   .628   .583   .549   .522   .499
 .40   1.290  1.053   .916   .826   .761   .712   .674   .642
 .35   1.910  1.489  1.263  1.121  1.022   .948   .892   .846
 .32   2.543  1.896  1.573  1.377  1.244  1.148  1.073  1.014
 .29   3.615  2.513  2.019  1.735  1.548  1.415  1.315  1.237
 .26          3.543  2.709  2.264  1.986  1.794  1.652  1.543

                                 M
N/MS      13     14     15     16     17     18     19     20
 .70    .148   .144   .140   .137   .135   .132   .130   .128
 .65    .189   .184   .179   .175   .172   .168   .165   .162
 .60    .239   .232   .226   .221   .216   .212   .208   .204
 .55    .301   .292   .284   .277   .271   .265   .260   .255
 .50    .379   .367   .356   .347   .339   .332   .325   .319
 .45    .480   .464   .450   .438   .427   .417   .408   .401
 .40    .616   .594   .575   .558   .544   .531   .519   .508
 .35    .808   .777   .750   .726   .705   .687   .671   .656
 .32    .966   .926   .892   .862   .836   .813   .793   .775
 .29   1.173  1.121  1.076  1.038  1.005   .976   .949   .926
 .26   1.456  1.385  1.325  1.274  1.230  1.191  1.157  1.127
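
Where a computer is available, the moment equation for $B$ can be solved
numerically instead of by interpolation in a table. The sketch below assumes
the relation $N/MS = (1/B) / [1/B + 1/(B+1) + \cdots + 1/(B+M-1)]$, whose right
side falls steadily from 1 to $1/M$ as $B$ grows, so that simple bisection
applies, and then takes $k = NB/M$. The function names and the search bounds
are hypothetical choices made for this illustration.

```python
def mean_occupancy(B, M):
    """(1/B) divided by the sum of 1/(B + j) for j = 0, ..., M - 1."""
    return (1.0 / B) / sum(1.0 / (B + j) for j in range(M))

def solve_B(ratio, M, lo=1e-9, hi=1e6, iterations=200):
    """Find B with mean_occupancy(B, M) == ratio by bisection."""
    if not (1.0 / M < ratio < 1.0):
        raise ValueError("N/MS must lie strictly between 1/M and 1")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean_occupancy(mid, M) > ratio:
            lo = mid          # occupancy too high, so B must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

def moment_estimates(counts):
    """counts[n-1] is the number of species seen in exactly n of the M samples.
    Returns the moment estimates (B, k)."""
    M = len(counts)
    S = sum(counts)
    N = sum(n * x for n, x in enumerate(counts, start=1))
    B = solve_B(N / (M * S), M)
    return B, N * B / M
```

For the observed counts 12, 6, 4, 3, 4, 13 over six samples, the call
`moment_estimates([12, 6, 4, 3, 4, 13])` returns the pair of estimates.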

When the B-distribution is being used for graduation only in the 
form (7), the moment equation for B is still (12) and Table I may again 
be used to give B. 


4. Examples 


Example 1: Hawaiian Mosses 


The examples are cited mainly to show the flexibility of the B-dis- 
tribution. It seems improbable that the ecological model used in 
the derivation of the B-distribution will provide a satisfactory expla- 
nation of either the moss data or the protozoa data. In fact, suppose 
we had graduated the data with the truncated form of the compound 
binomial distribution (3) when A was not necessarily small. Then the 
best value of A greater than or equal to 0, is certainly 0 as for the 
B-distribution, but the values of A found from moment equations are 
all less than 0 (for this truncated distribution A can formally take 
values down to —1). 

However as the accompanying table shows, the B-distribution 
provides very satisfactory graduation of the three differently-shaped 
moss distributions. In this table $n$ refers to the number of islands 
occupied, $x_n$ to the number of species found in $n$ islands, $u_n$ to the 
number predicted, and D, M and St to the dioecious, monoecious and 
sterile mosses respectively. Thus 12 dioecious species were found in 
only one island (not necessarily the same island) whereas for $B = .394$, 
10.66 were predicted. 

TABLE II 
Hawaiian Moss Data 

   n       D x_n    D u_n     M x_n    M u_n    St x_n   St u_n
   1        12     10.66       15     13.34       22     20.72
   2         6      6.07        6      6.39        7      8.30
   3         4      4.77        3      4.04        5      4.23
   4         3      4.48        2      2.82        1      2.25
   5         4      5.14        0      2.03        0      1.11
   6        13     10.88        4      1.38        2       .41
 Totals            42.00              30.00              37.02

              B =   .394        B =  1.220        B =  2.238
              k =  9.587        k = 13.827        k = 24.991
         s.e. k =  1.88    s.e. k =  2.94    s.e. k =  5.03
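
The predicted numbers can be generated term by term. The sketch below assumes
the first term $u_1 = kM/(M + B - 1)$ and the ratio
$u_{n+1}/u_n = n(M - n)/[(n + 1)(M + B - n - 1)]$, both of which follow from
the form of the B-distribution; the function name `b_distribution` is
hypothetical.

```python
def b_distribution(k, B, M):
    """Expected numbers u_1, ..., u_M of species appearing in n of M samples."""
    u = [k * M / (M + B - 1.0)]                      # term for n = 1
    for n in range(1, M):
        ratio = n * (M - n) / ((n + 1) * (M + B - n - 1.0))
        u.append(u[-1] * ratio)                      # u_{n+1} = u_n * ratio
    return u
```

With `k = 9.587`, `B = 0.394` and `M = 6` the first term is about 10.66, the
value quoted above for the dioecious mosses found in only one island.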

Anscombe [1950] showed that the usual estimate of the parameter 
a of the log distribution could be considered as approximately normally 
distributed. We may adapt his argument to show that, when M and 
B are large, our estimate of k is approximately normally distributed 
with variance k/log_e(M/B). We have given in the table the standard 
errors of k deduced from this formula. In any practical case we are 
never sure B and M are big enough for the formula to be a good approxi- 
mation and we suggest it be used in a conservative way, say, by employ- 
ing three and not two times the standard error of a difference of k-values 
as a yardstick of the importance of the difference. 
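
The standard errors quoted in the tables can be recomputed from the variance formula above. The following is a minimal sketch, assuming M = 6 for the moss data (the number of islands, read off the range of n in Table II) and M = 10 for the protozoa samples; the function names are ours, and the yardstick helper further assumes the two k-values are independent estimates.

```python
import math

def se_k(k: float, B: float, M: int) -> float:
    """Approximate standard error of k, from variance k / log_e(M / B)."""
    return math.sqrt(k / math.log(M / B))

def difference_is_notable(k1, se1, k2, se2, factor=3.0):
    """Conservative yardstick: |k1 - k2| against `factor` times the s.e. of
    the difference (assumes independent estimates, which is our assumption)."""
    return abs(k1 - k2) > factor * math.hypot(se1, se2)

# (k, B, M) as listed in the tables; the M values are assumptions noted above
fits = {
    "dioecious mosses": (9.587, 0.394, 6),
    "monoecious mosses": (13.827, 1.220, 6),
    "sterile mosses": (24.991, 2.238, 6),
    "protozoa": (8.154, 1.274, 10),
}
for name, (k, B, M) in fits.items():
    print(f"{name}: s.e. k = {se_k(k, B, M):.2f}")
```

The printed values agree with the tabulated standard errors to within rounding in the second decimal place.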


Example 2: New Zealand Protozoa 


When B and M are large, the B-distribution (8) has the form 
k[M/(M + B)]^n/n, which is the form of a log distribution with k for a 
and M/(M + B) for x. It may therefore be expected that the B-dis- 
tribution will often be close in form to a log distribution when the latter 
is calculated on the assumption that the number of appearances plays 
the role of the number of individuals in a sample in Fisher’s work. 
An example exhibiting this closeness is provided by figures on the 
occurrence of ciliate protozoa in ten samples from tussock country in 
Waiouru, New Zealand. These data were kindly supplied by Dr. 
J. D. Stout of the New Zealand Soil Bureau. In the accompanying 
Table III we give in successive columns the number x_n of species found 
in n samples, the number estimated by the B-distribution and finally 
the number estimated by the log distribution when N is taken as the 
number of individual protozoa found in the 10 samples. 


TABLE III 
Protozoa Data 

  n         x_n     B distrn. u_n     Log distrn. u_n 
  1          8          7.94               9.13 
  2          3          3.85               3.98 
  3          4          2.48               2.27 
  4          1          1.79               1.45 
  5          1          1.37                .99 
  6          2          1.08                .71 
  7          1           .87                .52 
  8          0           .70                .39 
  9          0           .54                .29 
 10          1           .38                .20 
  Totals               21.00              20.14 

                     B = 1.274          x = .8546 
                     k = 8.154          a = 10.892 
                     s.e. k = 1.99 


Already for these apparently low values of B and M the B-distri- 
bution and the log distribution are not greatly different. Often for 
large B and M there would be little more practical benefit, when gradu- 
ating data, in using the B-distribution in which n is bounded than in 
using the more convenient log distribution which allows n to be infinite. 


5. General Discussion 


The work in §2 naturally suggests a comparison with Fisher’s 
method of deriving his log distribution. He supposes the number of 
individual members of a species present in a sample has a Poisson 
distribution with mean m, say, and that m has a gamma-type dis- 
tribution. This parallels our p having a beta-type distribution. Corre- 
sponding to the compound binomial distribution (3) in his work is a 
negative binomial distribution of the number of individuals a species 
is represented by in a sample. He supposes the average value μ of m 
tends to 0 in a certain way, that the number Z of species tends to 
infinity but that Zμ is constant. He then deduces the expected number 
of species having n members in a sample. The result is usually referred 
to as the log distribution although it is not a distribution in the usual 
sense of that word. 

If Fisher's ecological model is appropriate, it is natural to introduce 
both constants a and x. Then (Anscombe, [1950]) maximum likelihood 
estimates of these can be made using the likelihood corresponding to 
our likelihood (5). The equations of estimation are the same as the 
equations used by Fisher. The quantity a represents some function 
of the abundance and diversity of the species present which may be 
used for comparison with other areas. It has its counterpart in our k 
which is also independent of the nature of the sampling when the 
ecological model is valid. 

If the model is not appropriate, the log distribution may 
still have a use as a graduating curve, but the quantity a, having 
no physical significance, is not introduced. The form of distribution 
x^n/[−n log (1 − x)] is then employed. The maximum likelihood 
estimate of x is the same as that deduced under the assumption that 
Fisher’s model is valid. It is then desirable to look for other models 
that might lead to distributions of log type. Anscombe [1950] lists 
other models of this kind that have been discussed in the literature. 
For instance the negative binomial is the distribution of the population 
size of a species when birth and death rates do not vary with the age 
of the individual and the population is sustaining a steady rate of 
immigration. It is also the expected distribution of the number of 
individuals in a sampling area when the population multiplies and 
spreads from a number of randomly placed centres of fertility. Each 
of these models could lead to a log distribution if again one of the con- 
stants of the negative binomial distribution were small. 

Another type of model has been discussed (Darwin, [1953]). A 
possible hypothesis for the rate of creation of new species is suggested. 
This hypothesis leads to an exponential growth of the number of species 
when each new species grows with constant birth and death rates—then 
the number of species having a population size n can vary in a manner 
very like that described by the log distribution. 

A similar situation to that just discussed for the log distribution 
holds for the B-distribution. If the model is not considered appropriate 
and the distribution is being used for graduation only until a more 
satisfactory model is evolved, it is more consistent with normal statisti- 
cal procedure to consider the distribution in the form (7) in which 
k is not introduced. Of course there is no arithmetical distinction in 
practice between the forms (7) and (8) since for the same method of 
estimation (maximum likelihood or moment estimation) the same 
value B will be found and the same values u_n deduced whichever form 
of the distribution is considered. 

An alternative model has been proposed specifically for presence 
and absence data of the Hawaiian moss kind (Darwin [1953]). In this 
paper the number of species is presumed to grow exponentially. Once 
a species occupies n out of M areas its probability rate of expansion 
into an (n + 1)th area is taken as proportional to n(M — n). This 
leads to a probability of a species being found in n out of M areas that 
is, for suitable values of the constant involved, very like a B-distri- 
bution though different algebraically. The moss data were well gradu- 
ated by this model. 

We note finally two further connections between the B-distribution 
and the log distribution that might be of interest when the model 
leading to the B-distribution is considered a good approximation for 
the ecological association being investigated. 

(a) For the log distribution a x^n/n a simple change of constants is 
possible if sampling is increased by a factor u; the constant x becomes 
xu/[1 + x(u − 1)]. For (7) no such simple exact alteration is possible 
but the equivalent one in which B is replaced by B/u is often a good 
approximation. 
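
The change of constants in (a) is easy to check numerically. The sketch below assumes the usual relation x = N/(N + a) for the log distribution, so that multiplying the sampling by u multiplies x/(1 − x) by u; the function names are ours.

```python
def rescale_x(x: float, u: float) -> float:
    """Constant x of the log distribution after sampling is increased u-fold."""
    return x * u / (1.0 + x * (u - 1.0))

def rescale_B(B: float, u: float) -> float:
    """Approximate counterpart for the B-distribution: B is replaced by B / u."""
    return B / u

# Illustrative numbers only: a = 10 and N = 100 individuals give x = 100/110
a, N = 10.0, 100.0
x = N / (N + a)
print(rescale_x(x, 2.0))   # compare with 2N / (2N + a)
print(2 * N / (2 * N + a))
```

Applying the rule twice, with factors u and w, gives the same result as applying it once with factor uw, as one would hope of a change of sampling intensity.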

(b) It is of course possible to find a distribution equivalent to the 
B-distribution using the identical distribution assumptions Fisher used 
in his derivation of the log distribution. Then the probability of a 
species appearing in n out of M samples takes the binomial form (1) 
with p = 1 — exp (—m) where m is the mean of the Poisson distri- 
bution of the number of representatives of a species in a sample. We 
do not give the analysis because the resulting formulae corresponding to 
the B-distribution seem to be too complicated for practical use. 

It is worth remarking however that, if Fisher’s assumptions are 
used, there will be two moment equations for a and x similar to our 
moment equations (10) and (11) for k and B. These moment equations 
will yield a value of a which will for large B and M be very close to 
the value of k given by our moment equations for the same values of 
N and S. That the values a and k may be close even when B and M 
are not very large is shown by the examples discussed in §4. For the 
four examples, the values (k,a) are, in order, (9.587, 10.34), (13.827, 
14.87), (24.991, 25.15), (8.154, 8.26). The differences between a and 
k are all small compared with the listed approximate standard errors of 
k. One may therefore use our tables of B to obtain a value of k which 
may be taken as a good estimate of the a required by Fisher’s analysis. 


ON THE NUMBER OF SELF-INCOMPATIBILITY ALLELES 
MAINTAINED IN EQUILIBRIUM BY A GIVEN 
MUTATION RATE IN A POPULATION OF 
GIVEN SIZE: A REEXAMINATION¹ 


SEWALL WRIGHT 


Department of Genetics, University of Wisconsin, 
Madison, Wisconsin, U.S. A. 


SELF-INCOMPATIBILITY ALLELES IN OENOTHERA ORGANENSIS 


Some twenty years ago, Emerson [1, 2] found an extraordinary 
number of self-incompatibility alleles (at least 37, later stated by Lewis 
[7] to be 45) in Oenothera organensis, a species believed to be restricted 
to a few relatively moist canyons in the Organ Mountains of New 
Mexico within an area of 33 square miles above the 6000 foot line. On 
careful search, Emerson found 154 individuals in the four supposedly 
most favorable canyons. He states: “On the basis of this preliminary 
survey, it is estimated that the entire population of this species con- 
sists of less than one thousand and very likely less than five hundred”. 


ATTEMPTED EXPLANATION OF THE NUMBER OF ALLELES 


The present author [11] attempted, at Dr. Emerson’s suggestion, 
to find the theoretical number of alleles n that would be maintained 
in equilibrium between mutation at a given rate u and the loss of 
alleles from accidents of sampling in a population of specified size N. 
The peculiar mode of inheritance (failure of pollen tube growth in a 
style in which either of the alleles is that of the pollen grain) did not seem 
to lead to usable exact formulae. Approximations were used that were 
believed to be adequate for numerical estimates. It should be added 
that even if exact counts were available, the effective number of indi- 
viduals and the average length of generation would be matters for 
estimation in this perennial plant that blooms only under favorable 
conditions. There are questions also on the effective mutation rate 
which will be taken up later. 


¹ Paper No. 754 from the Department of Genetics, University of Wisconsin. 

The analysis indicated that only about 13 alleles would be main- 
tained in a random breeding population of 500 and only about 19 in 
one of 1000 by a uniform effective mutation rate of 10~° per generation 
from each allele to all others collectively, assuming no limit to the 
possible number. It would require a rate of about 10~*** per generation 
to maintain 45 alleles in a population of 500, or of 10~*** in one of 1000. 

The hypothesis of high mutability seemed to be ruled out by 
Emerson’s failure to find a single mutant in 45,000 pollen grains. 
Various other possibilities were suggested [11]: “One possible ex- 
planation would be that the size of the population is much greater 
than estimated from the data now at hand or, if not greater now, that 
it has recently been much greater, with loss of alleles at too slow a 
rate to have reached equilibrium. Inspection of Figure 2 indicates 
that a population of some 4,000 to 5,000 would be required to account 
for the probable number of alleles. Another possible explanation 
would be that some alleles are much more unstable than others. An 
average rate greater than 10~* seems, however, improbable. Finally, 
there is the possibility that the large number of alleles is a consequence 
of local inbreeding.” 

The last possibility was explored theoretically. The hypothesis that 
the species is subdivided into completely isolated groups of about 50 
individuals (each of which would maintain five or six alleles, all different 
from those in other colonies) was at once ruled out by Emerson’s 
observation that many alleles were common to two or three of the four 
canyons that he studied. There was, indeed, significant differentiation 
among the localities but the hypothesis of subdivision into partially 
isolated groups of as many as fifty individuals could also be ruled out. 
It was concluded under this head [11] that “the only possible interpre- 
tation, accepting 500 as an estimate of the total population, assuming 
mutation rates of less than 10~* and assuming equilibrium, seems to 
be along the line of a finer subdivision. ... It appears that if plants are 
pollinated in some 98 per cent or more of the cases by their immediate 
neighbors and only 2 per cent or less by a random sample of pollen from 
the species as a whole, it would be possible for a species of only 500 
individuals to maintain 40 or 50 alleles by mutation rates of the order of 
10° to 10~° per generation.” These, of course, are very extreme con- 
ditions. 

More recently, Lewis [6] has made tests of mutation rate in 
Oenothera organensis on a grand scale. He found that a mutational 
loss of the pollen reaction, not associated with change in the stylar 
reaction, tended to occur in about 10° cell divisions but he did not find 
a single mutation of the type of the natural alleles in 220 X 10° cell 
divisions, untreated, or in 3 X 10° under X-ray treatment. He notes 
indeed that the effective mutation rate (proportion of mutant indi- 
viduals) would be expected to be greater than the frequency in pollen 
grains if the styles are exposed to large quantities of self pollen. He 
finds that there is room for some 5,000 pollen grains on the stigma. If 
the average proportion of incompatible pollen on the stigma is x, the 
frequency of mutations in pollen grains must be divided by (1 − x) to 
give effective u. Thus if 99 per cent of the pollen is incompatible, effec- 
tive u is 100-fold greater than the rate in unscreened pollen. An effective 
rate of 10~° is thus not wholly ruled out. If, however, effective u is as 
low as 107°, the minimum required size of population under the con- 
ditions assumed above becomes about 10,000. The hypothesis of fine 
scaled subdivision of a population in equilibrium at the present estimated 
size becomes even more improbable than before as the sole explanation 
of the number of alleles. A combination hypothesis, recent reduction 
from a population several times as large as at present, aided by partial 
isolation of colonies, seems the most plausible conclusion under the 
theory of the numerical relations that was presented. 


FISHER’S CRITICISM 


Fisher [5] has recently made an analysis that has led him to the 
conclusion that this theory is not correct. He arrives at numerical 
relations that increase still further the difficulty of interpretation, 
which he puts exclusively on the basis of recent reduction in size of 
population. The final paragraph of his account is as follows: 

“It may be added that the treatment of this case by S. Wright 
leads to results, as shown by his graphs, very different from those 
obtained above. Wright, however, fails to develop any explicit formulae, 
but seems to have relied on extensive numerical calculations based on 
trial values of the numerous constants he introduces. It is hoped that 
the foregoing discussion, using the method of the 1930 edition, will 
set the situation of these alleles in a clearer light.” 

Fisher’s analysis differs in nearly all respects: in the formulae for 
rate of change of gene frequency, for the sampling variance, for the 
distribution of gene frequencies, for the rate of loss of alleles and for 
the number of alleles. 

He has made a definite advance in using the exact formula for the 
sampling variance, but on substituting this for my approximation, it 
turns out that this causes no appreciable change in the graphs in the 
1939 paper, including that on the relation of number of alleles to 
mutation rate and size of population. In other respects, he uses alterna- 
tive approximations or introduces approximate integrations where 
I used less elegant but possibly more reliable quadratures. He avoids 
the tedious iteration process required to make the mutation rate, 
required to offset extinction of alleles, agree with mutation rate as a 
factor in the rate of decrease of gene frequencies, by ignoring the latter. 
This is, indeed, important only at unrealistically high mutation rates, 
but in one of his two examples u = 10°*. Iteration is all important 
in the analogous situation that arises in investigating the effects of 
the partial isolation of colonies, which he does not deal with at all. 

As he does not locate the significant causes of the very different 
results to which he refers, it seems necessary to examine each step if 
the theory of this interesting phenomenon is not to be left in confusion. 


RATE OF CHANGE OF GENE FREQUENCY 


Analysis depends on obtaining formulae for the mean rate of change 
Δq of gene frequency q and for the sampling variance σ²_δq, in order to 
deduce the equilibrium distribution φ(q) of gene frequencies under the 
specified conditions. 

We will consider first the rate of change of gene frequency which 
remains the least satisfactory part of the theory. The approximate 
method used in the 1939 paper was described as follows: 

“We assume the existence of a series of n self-sterility alleles, 
q_1, q_2, ..., q_n such that Σq = 1, in a population of N diploid individuals. 
The frequencies of zygotes containing one of these (S_i) must be 2q 
(with the appropriate subscript) since all are heterozygotes. The 
frequency of functioning S_i female gametes is q, assuming no differential 
selection. The frequency of functioning S_i pollen grains is not in 
general the same. S_i pollen has, by hypothesis, no chance of function- 
ing in the styles of zygotes containing S_i, but has a better than average 
chance in zygotes that lack S_i (frequency (1 − 2q)), since each zygote 
of this class inhibits pollen of two of the other kinds. Assume that 
on non-S_i styles the ratio of successful S_i pollen grains to successful 
ones of the other types is as q : R(1 − q). The total frequency of function- 
ing S_i pollen is then q(1 − 2q)/[q + R(1 − q)]. The average frequency 
of functioning S_i gametes is q(1 − q)(1 + R)/2[q + R(1 − q)] and the 
change from the previous generation is therefore: 

    $$\Delta q = -\frac{q[q(3 - R) - (1 - R)]}{2[q + R(1 - q)]} \qquad (1)$$

If n = 3, R is obviously zero and Δq reduces to −(3/2)[q − 1/3] 
exactly. Otherwise, R varies among the alleles but approaches uni- 
formity either as the number of alleles increases or as their frequencies 
cluster more closely about the equilibrium point. It was assumed that 
an adequate approximation could be obtained by assuming constancy 
of R. Where this is valid, the gene frequency at which Δq = 0 is given 
by q̂ = (1 − R)/(3 − R). This led to the equivalent approximate 
formula: 

    $$\Delta q = -\frac{q(q - \hat{q})}{1 - 3\hat{q} + 2q\hat{q}} \qquad (2)$$


The term 2qq̂ is so small with large n that the estimate of Δq would 
be little changed by replacing it by 2q̂², giving the more convenient 
approximation 

    $$\Delta q = -\frac{q(q - \hat{q})}{(1 - \hat{q})(1 - 2\hat{q})} \qquad (3)$$

Formulae (1) and (2) were retained, however, because they reduce to 
the exact formula if n = 3, q̂ = 1/3, and were then thought to be 
slightly more accurate for n greater than 3. 
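
The passage from the 1939 formula to its two approximate forms can be checked in a few lines. The sketch below assumes that (1) reads Δq = −q[q(3 − R) − (1 − R)]/2[q + R(1 − q)], which is what the gamete frequencies quoted from the 1939 paper imply; the intermediate steps are ours and are not taken from the original.

```latex
\begin{align*}
\hat{q} = \frac{1-R}{3-R}
  &\;\Longrightarrow\; R = \frac{1-3\hat{q}}{1-\hat{q}}, \qquad
  3 - R = \frac{2}{1-\hat{q}}, \\
q + R(1-q) &= \frac{1 - 3\hat{q} + 2q\hat{q}}{1-\hat{q}}, \\
\Delta q = -\frac{q(3-R)(q-\hat{q})}{2[q+R(1-q)]}
  &= -\frac{q(q-\hat{q})}{1-3\hat{q}+2q\hat{q}} && \text{(2)} \\
  &\approx -\frac{q(q-\hat{q})}{1-3\hat{q}+2\hat{q}^{2}}
   = -\frac{q(q-\hat{q})}{(1-\hat{q})(1-2\hat{q})} && \text{(3)}
\end{align*}
```

With n = 3 and R = 0 the first line gives q̂ = 1/3, and (2) then returns −(3/2)(q − 1/3), the exact result.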

There is, of course, no difficulty in calculating the exact change 
in gene frequency due to self incompatibility if the set of zygotic fre- 
quencies q_jk is given. In the following, the summation relates to all 
pairs (S_j, S_k) of alleles different from the one considered (S_i). 

    $$\Delta q_i = \frac{q_i}{2}\left[\sum \frac{q_{jk}}{1 - q_j - q_k} - 1\right] \qquad (4)$$

Fisher, who follows essentially this approach, holds that a good 
approximation for Δq_i can be obtained by replacing q_j and q_k by the 
equilibrium value q̂ (or a in his notation). If this is done, (4) reduces 
to the following, noting that Σq_jk = 1 − 2q_i. Using q for q_i as above: 

    $$\Delta q = -\frac{q(q - \hat{q})}{1 - 2\hat{q}} \qquad (5)$$


This differs from (3) only in lacking the factor (1 − q̂) in the denomi- 
nator. 

We have noted that (2) reduces to the exact formula if n = 3. 
Formulae (3) and (5) are both far from accurate in this case but this 
is of little importance since the gene frequencies are not allowed to 
deviate appreciably from 1/3 in any population of appreciable size. 
The inaccuracies of all of the formulae become more important if n = 4 
but decrease in importance as n increases. Pending the development of 
a more complete theory, it is useful to compare the three estimates 
(2), (5) and (3) with the exact values of Δq for a number of specific 
sets of zygotic frequencies: Tables 1 and 2 show such comparisons for 
n = 4 and n = 5 respectively. The value of q̂ is derived from the con- 
dition Σ(Δq) = 0. It is given by q̂ = Σq² in (5) and (3) but requires 
solution of a polynomial in (2). The square roots of the mean square 
differences RMS from exact Δq are given as a basis for judgment of 
the closeness of approximation. 
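
The comparison can be reproduced for any one set of zygotic frequencies. The sketch below is illustrative only: it takes the formulae in the forms Δq_i = (q_i/2)[Σ q_jk/(1 − q_j − q_k) − 1] for (4), −q(q − q̂)/(1 − 3q̂ + 2qq̂) for (2), −q(q − q̂)/(1 − 2q̂) for (5) and −q(q − q̂)/[(1 − q̂)(1 − 2q̂)] for (3), and it assumes the third set of Table 1 has zygotic frequencies 1/15 and 4/15 (printed as .066 and .266). The function names are ours.

```python
def gene_frequencies(zyg, n):
    """q_i is half the total frequency of zygotes carrying S_i (all heterozygous)."""
    q = [0.0] * n
    for (j, k), f in zyg.items():
        q[j] += f / 2
        q[k] += f / 2
    return q

def exact_dq(zyg, q):
    """Formula (4): S_i pollen functions only on styles S_j S_k lacking S_i."""
    out = []
    for i in range(len(q)):
        s = sum(f / (1 - q[j] - q[k]) for (j, k), f in zyg.items() if i not in (j, k))
        out.append(q[i] / 2 * (s - 1))
    return out

def dq_1939(q, qhat):      # formula (2)
    return -q * (q - qhat) / (1 - 3 * qhat + 2 * q * qhat)

def dq_fisher(q, qhat):    # formula (5)
    return -q * (q - qhat) / (1 - 2 * qhat)

def dq_new(q, qhat):       # formula (3)
    return -q * (q - qhat) / ((1 - qhat) * (1 - 2 * qhat))

# Third set of Table 1 (alleles indexed from 0); frequencies are our reading
zyg = {(0, 1): 1 / 15, (0, 2): 1 / 15, (0, 3): 1 / 15,
       (1, 2): 4 / 15, (1, 3): 4 / 15, (2, 3): 4 / 15}
q = gene_frequencies(zyg, 4)
qhat = sum(x * x for x in q)          # used in (5) and (3)
qhat_1939 = 0.27213                   # tabulated root for (2), not recomputed here
exact = exact_dq(zyg, q)
for qi, ex in zip(q, exact):
    print(f"q={qi:.2f} exact={ex:+.4f} (2)={dq_1939(qi, qhat_1939):+.4f} "
          f"(5)={dq_fisher(qi, qhat):+.4f} (3)={dq_new(qi, qhat):+.4f}")
```

For the rare allele (q = .10) this gives +.0500 exactly, against +.0723, +.0409 and +.0568 from (2), (5) and (3), in agreement with the tabulated row.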

These distributions are not, of course, equilibrium distributions 
at any specifiable size of population and q̂ is thus not a permanent 
constant. Some of them resemble equilibrium distributions in being 
symmetrical or positively skewed while others are very far from equi- 
librium distributions in form. 


TABLE 1 

ARBITRARY SETS OF ZYGOTIC FREQUENCIES 
AND THE IMPLIED CHANGES OF GENE FREQUENCIES 

                                      Change of gene frequency (Δq) 
Zygotic          Gene             (4)       (2)       (5)            (3) 
frequencies      frequencies      Exact     (1939)    (Fisher 1958)  (New) 
(q_jk)           (q_i) 

                 q̂                         .27786    .295           .295 
S1S2  .05        q1   .10        +.0600    +.0801    +.0476         +.0675 
S1S3  .05        q2   .25        +.0346    +.0228    +.0274         +.0389 
S1S4  .10        q3   .25        +.0346    +.0228    +.0274         +.0389 
S2S3  .10        q4   .40        −.1292    −.1257    −.1024         −.1453 
S2S4  .35 
S3S4  .35        RMS                        .0132     .0156          .0094 

                 q̂                         .2803     .30            .30 
S1S2  .04        q1   .10        +.0656    +.0838    +.0500         +.0714 
S1S3  .05        q2   .20        +.0770    +.0592    +.0500         +.0714 
S1S4  .11        q3   .30        −.0147    −.0181     0              0 
S2S3  .11        q4   .40        −.1279    −.1249    −.1000         −.1428 
S2S4  .25 
S3S4  .44        RMS                        .0129     .0222          .0112 

                 q̂                         .27213    .28            .28 
S1S2  .066       q1   .10        +.0500    +.0723    +.0409         +.0568 
S1S3  .066       q2   .30        −.0166    −.0241    −.0136         −.0189 
S1S4  .066       q3   .30        −.0166    −.0241    −.0136         −.0189 
S2S3  .266       q4   .30        −.0166    −.0241    −.0136         −.0189 
S2S4  .266 
S3S4  .266       RMS                        .0129     .0053          .0039 

                 q̂                         .26625    .28            .28 
S1S2  .066       q1   .20        +.0444    +.0431    +.0364         +.0505 
S1S3  .066       q2   .20        +.0444    +.0431    +.0364         +.0505 
S1S4  .266       q3   .20        +.0444    +.0431    +.0364         +.0505 
S2S3  .066       q4   .40        −.1333    −.1292    −.1091         −.1515 
S2S4  .266 
S3S4  .266       RMS                        .0023     .0139          .0105 

(4): The exact changes in gene frequency to the next generation and the three approximations, 
(2): 1939 formula, (5): Fisher’s 1958 formula, (3): new formula. RMS: square root of mean square 
deviation from exact change. 

TABLE 2 

ARBITRARY SETS OF ZYGOTIC FREQUENCIES 
AND THE IMPLIED CHANGES OF GENE FREQUENCIES 

                                      Change of gene frequency (Δq) 
Zygotic          Gene             (4)       (2)       (5)            (3) 
frequencies      frequencies      Exact     1939      Fisher 1958    New 

                 q̂                         .23671    .245           .245 
S1S2  .02        q1   .05        +.0217    +.0298    +.0191         +.0253 
S1S3  .02        q2   .20        +.0231    +.0191    +.0176         +.0234 
S1S4  .02        q3   .20        +.0231    +.0191    +.0176         +.0234 
S1S5  .04        q4   .20        +.0231    +.0191    +.0176         +.0234 
S2S3  .08        q5   .35        −.0910    −.0870    −.0720         −.0954 
S2S4  .08 
S2S5  .22 
S3S4  .08 
S3S5  .22 
S4S5  .22        RMS                        .0051     .0096          .0026 

                 q̂                         .20415    .205           .205 
S1S2  .066       q1   .15        +.0170    +.0181    +.0140         +.0176 
S1S3  .066       q2   .20        +.0023    +.0018    +.0017         +.0021 
S1S4  .066       q3   .20        +.0023    +.0018    +.0017         +.0021 
S1S5  .100       q4   .20        +.0023    +.0018    +.0017         +.0021 
S2S3  .100       q5   .25        −.0240    −.0234    −.0191         −.0240 
S2S4  .100 
S2S5  .133 
S3S4  .100 
S3S5  .133 
S4S5  .133       RMS                        .0007     .0026          .0003 

                 q̂                         .23222    .24            .24 
S1S2  .02        q1   .10        +.0340    +.0378    +.0269         +.0354 
S1S3  .04        q2   .10        +.0340    +.0378    +.0269         +.0354 
S1S4  .07        q3   .20        +.0242    +.0163    +.0154         +.0202 
S1S5  .07        q4   .30        −.0461    −.0459    −.0346         −.0455 
S2S3  .04        q5   .30        −.0461    −.0459    −.0346         −.0455 
S2S4  .07 
S2S5  .07 
S3S4  .16 
S3S5  .16 
S4S5  .30        RMS                        .0043     .0094          .0020 

                 q̂                         .22858    .2366          .2366 
S1S2  .02        q1   .11        +.0330    +.0358    +.0264         +.0346 
S1S3  .05        q2   .11        +.0330    +.0358    +.0264         +.0346 
S1S4  .05        q3   .22        +.0092    +.0046    +.0069         +.0091 
S1S5  .10        q4   .22        +.0092    +.0046    +.0069         +.0091 
S2S3  .05        q5   .34        −.0845    −.0807    −.0667         −.0874 
S2S4  .05 
S2S5  .10 
S3S4  .10 
S3S5  .24 
S4S5  .24        RMS                        .0038     .0091          .0016 

(4): The exact changes to the next generation, and three approximations: (2): 1939 formula, 
(5): Fisher’s 1958 formula, (3): new formula. RMS: square root of mean square deviation from 
exact change. 


We note that, in all but two of the eight cases, the new approxi- 
mation (3) shows the smallest differences from the exact values given 
under (4), the 1939 formula (2) comes next and Fisher’s formula, (5), 
is third. The exceptional cases are the third and fourth in Table 1. 
In the former, in which all but one of the alleles have the same fre- 
quency, and this is relatively low, (3) is still the best but formulae 
(2) and (5) change places. In extreme cases of this type (all gene 
frequencies the same except one that is very small) Fisher’s formula 
(5) approaches exactness and hence becomes superior to (3) while formula 
(2) becomes wholly inadequate. The fourth case in Table 1 is also one 
in which all but one of the frequencies are the same, but that one has 
the highest instead of the lowest frequency. In this case formula (2) 
gives much the closest result and (5) is the poorest. In extreme cases 
of this type, however, (all or nearly all individuals heterozygous for 
one allele, the other alleles equally frequent) both (2) and (5) approach 
exactness, the former more rapidly. The above statements apply to 
extreme cases of the two types indicated, irrespective of number of 
alleles. 

The cases in which the new approximation (3) loses its superiority 
are the ones that are farthest from the form at equilibrium. The 
conclusion seems warranted that in distributions of the equilibrium 
type this approximation (3) is more accurate than that used in the 1939 
paper, (2), which, however, is more accurate than that proposed by 
Fisher, (5). As formula (3) seems to be the limiting value for large 
n and for close clustering of gene frequencies about the equilibrium 
point and is also more convenient than (2), it will be adopted here for 
all values of n greater than 3. 

The differences due to my use of (2) in the 1939 paper and Fisher’s 
use of (5) for Δq become so slight with large n that they can hardly be 
responsible for the large differences between the results which the 
latter finds and it is necessary to look farther. 


THE SAMPLING VARIANCE 


The conventional sampling variance for an array of 2N gametes, 
σ²_δq = q(1 − q)/2N, was used in the 1939 paper. It was recognized 
that it was only an approximation in this case but with random sampling 
among N ovules and an approach to this among the N pollen grains, at 
least with a large number of alleles, it was thought to be adequate 
for numerical results. The exact sampling variance, given by Fisher, 
is, however, just as simple and as convenient in deriving the formula 
for the distribution φ(q). The sampling variance for the class of zygotes 
carrying a given allele S_i (frequency 2q_i) is 2q_i(1 − 2q_i)/N. The 
sampling variance for q_i is just one fourth of this. 

    $$\sigma^2_{\delta q_i} = q_i(1 - 2q_i)/2N \qquad (6)$$

It is obvious that the conventional formula differs little, if q_i is 
small, as it must always be in equilibrium distributions with many 
alleles, but may be seriously in error for the large values of q_i expected 
if the number of alleles is small. The seriousness of the errors where 
n is small will be considered later. Meanwhile as already noted it does 
not appear that the use of the incorrect formula for σ²_δq in the 1939 
paper can be the cause of the large difference in results which Fisher 
finds in populations in which the equilibrium value of q is small. 
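
How far the two variance formulae diverge is easily tabulated. The sketch below evaluates both at q = 1/n, which is our simplifying choice of a representative gene frequency for n equally frequent alleles; the population size of 500 is illustrative and cancels from the ratio.

```python
def var_conventional(q: float, N: int) -> float:
    """Conventional sampling variance for 2N gametes: q(1 - q) / 2N."""
    return q * (1 - q) / (2 * N)

def var_exact(q: float, N: int) -> float:
    """Exact sampling variance for a self-incompatibility allele: q(1 - 2q) / 2N."""
    return q * (1 - 2 * q) / (2 * N)

N = 500  # illustrative population size
for n_alleles in (3, 5, 13, 45):
    q = 1 / n_alleles  # assumption: equal frequencies
    ratio = var_conventional(q, N) / var_exact(q, N)
    print(f"n = {n_alleles:2d}: conventional / exact = {ratio:.3f}")
```

The ratio is (1 − q)/(1 − 2q): exactly two at q = 1/3, but only 44/43 at q = 1/45.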


THE DISTRIBUTION OF GENE FREQUENCIES 


We come now to the distributions of gene frequencies. The final 
sentence of Fisher’s account, quoted above, seems to imply that the 
main cause of the differences in results comes here. 

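My method was merely to substitute Δq and σ²_δq in the relatively 
simple general formula for the steady state. 

    $$\phi(q) = \frac{C}{\sigma^2_{\delta q}} \exp\left[2\int \frac{\Delta q}{\sigma^2_{\delta q}}\,dq\right] \qquad (7)$$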


This formula was derived first [10] as that under which the mean 
and variance remain the same after the occurrence of both directed and 
random changes, measured by Δq and σ²_δq respectively. As shown later 
[13], all moments (all necessarily finite) remain the same provided that 
the directed Δq and random δq changes are so small that terms in 
and higher powers may be omitted in the ex- 
pansion of the basic equation (8) (in which n defines the moment in 
question, not the number of alleles). 


a=0 


Integration is substituted for summation in deriving the formula. 

It has also been shown [12] that (7) is the steady-state solution 
of the Fokker-Planck equation used in physics in dealing with joint 
effects of directed and random motion. 

In the case of n = 3, in which both Δq and σ²_δq are known exactly 
as indicated above, substitution in (7) gives 

    $$\phi(q) = C q^{2N-1}(1 - 2q)^{N-1}$$

This replaces the formula given on page 540 of the 1939 paper. The 
mean, q̄ = 1/3, is unchanged. The variance 
comes out 1/(54N + 18) or approximately 1/54N. The 1939 result 
was 2/(54N + 9) or approximately twice as great, as expected from 
the two-fold sampling variance in the neighborhood of the mean in 
this case. It is, however, so small (only about one third the sampling 
variance) as to be of no importance. With Δq, for any value of q, 
overshooting the mean by 50 per cent, there can be no appreciable 
random drift. Even with n = 4, q̂ = .25, gene frequency tends to return 
more than two thirds of the way toward equilibrium in each generation 
from deviations close to the mean, and thus random drift is expected 
to be small unless the population is exceedingly small. 
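
The mean and variance for n = 3 can be confirmed in exact arithmetic. The sketch below assumes that φ(q) is proportional to q^(2N−1)(1 − 2q)^(N−1) on 0 < q < 1/2, which is what substitution of Δq = −(3/2)(q − 1/3) and the exact variance q(1 − 2q)/2N into (7) gives; the moments are evaluated with the beta integral after putting p = 2q. The function names are ours.

```python
from fractions import Fraction
from math import factorial

def raw_moment(r: int, N: int) -> Fraction:
    """Exact integral over 0 < q < 1/2 of q**r * q**(2N-1) * (1-2q)**(N-1).

    With p = 2q this is (1/2)**(m+1) * m! k! / (m+k+1)!,
    where m = 2N - 1 + r and k = N - 1.
    """
    m, k = 2 * N - 1 + r, N - 1
    return Fraction(factorial(m) * factorial(k), factorial(m + k + 1)) / 2 ** (m + 1)

def mean_and_variance(N: int):
    """Mean and variance of q under the assumed n = 3 distribution."""
    m0, m1, m2 = (raw_moment(r, N) for r in range(3))
    mean = m1 / m0
    return mean, m2 / m0 - mean ** 2

for N in (1, 5, 50):
    mean, var = mean_and_variance(N)
    print(N, mean, var, Fraction(1, 54 * N + 18))
```

The same calculation with the conventional variance q(1 − q)/2N leads to a distribution proportional to q^(2N−1)(1 − q)^(4N−1), whose variance is 2/(54N + 9), the 1939 figure.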

In the 1939 paper, the following expression was used for Δq includ- 
ing, in addition to the selection term (1), terms for mutation to (rate v) 
and from (rate u) the allele in question and, when dealing with a colony, 
replacement to the extent m by a sample representative of the species 
as a whole (gene frequency q_t). 


+ — R)] (10) 


— +ut — + + — 9). 


Substitution of this and the value of σ²_δq that was used gave the 
following which is explicit except for the constant C: 


= CIR + — 


(11) 


Fisher does not go into the effects of subdivision of the species. 
Thus m must be treated as zero for comparison with his results. The 
presence of v in the formula was required in a study of the effects of 
limitation in the possible number of alleles, also not considered by 
Fisher. If the possible number is much larger than actually found, v 
may be treated as negligible and dropped from the formula. This leaves 
only the term −uq in addition to the selection term in (10), and this 
may be dropped if the rate from each allele to all others collectively 
is as small as it actually seems to be. 

If now we take Δq = −kq(q − q̂) − uq in which 

    $$k = 1/[(1 - \hat{q})(1 - 2\hat{q})]$$

in the formula that I prefer, but k = 1/(1 − 2q̂) and u = 0 for compari- 
son with Fisher’s results, and use the exact sampling variance, 
σ²_δq = q(1 − 2q)/2N, substitution in (7) yields: 

    $$\phi(q) = C q^{-1}(1 - 2q)^{Nk(1 - 2\hat{q}) + 2Nu - 1} e^{2Nkq} \qquad (12)$$


in which the constant factor C will be discussed later. 
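
The substitution leading to (12) can be followed step by step. The working below is ours, not the author's, and assumes that (7) has the form φ(q) = (C/σ²_δq) exp[2∫(Δq/σ²_δq) dq], with Δq = −kq(q − q̂) − uq and σ²_δq = q(1 − 2q)/2N as stated above.

```latex
\begin{align*}
\frac{2\,\Delta q}{\sigma^2_{\delta q}}
  &= \frac{4N[-k(q-\hat{q}) - u]}{1-2q}
   = 2Nk + \frac{4N(k\hat{q} - u) - 2Nk}{1-2q}, \\
2\int \frac{\Delta q}{\sigma^2_{\delta q}}\,dq
  &= 2Nkq + [Nk(1-2\hat{q}) + 2Nu]\,\log(1-2q), \\
\phi(q) &= \frac{C}{\sigma^2_{\delta q}}
   \exp\!\left[2\int \frac{\Delta q}{\sigma^2_{\delta q}}\,dq\right]
   \propto q^{-1}(1-2q)^{Nk(1-2\hat{q})+2Nu-1}\,e^{2Nkq}.
\end{align*}
```

With k = 1/(1 − 2q̂) and u = 0 the exponent of (1 − 2q) reduces to N − 1, so the distribution differs from the preferred form only through the value of k and the term 2Nu.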

Fisher, following a method that he used in the 1930 edition of his 
book [4], obtains a differential equation for the flux due to selection. 
In his notation p, a and y correspond to q, q̂ and φ(q) here. He makes 
use in part of a transformation of scale, 2p = sin (θ/2), designed to 
make the sampling variance uniform. This, however, requires a correc- 
tion term for effect on the mean, the second term in (13), the absence of 
which led to erroneous results in his first presentation [3] of the method 
in 1922. He introduced it into his book after comparison in 1929 with 
results [9], then in manuscript, that I had reached by use of an integral 
equation that did not involve the above transformation of scale and 
which yielded terminal conditions, verified in other ways, which did 
not agree with his earlier results: 


He arrives at the following as the element of frequency on the 
scale of gene frequencies. 


On replacing p by q, $1/(1 - 2a)$ by k, and collecting the constant
terms, it may be seen that this is identical with (12), if 2Nu in the
latter is negligible. Thus the supposed difference in the form of the
distribution of gene frequencies with given $\Delta q$ and $\sigma^2_{\delta q}$ cannot be the
cause of the differences in the numerical results.


NUMBER OF ALLELES AND RATE OF TURNOVER 


The discrete steady-state distribution of which (7) is a continuous 
approximation, includes a term P(0), the probability that the gene in 
question is absent. It was shown [9] that P(0) is ordinarily related
to the probability P(1/2N) for a single representative of the gene 
in an isolated population by the approximate formula 2Nv P(0) = 
(1/2)P(1/2N) which expresses the balance between mutation and loss 
by accidents of sampling. If the mutation rate v to the gene from the 
array of alleles is very small, as expected where the number of possible 
alleles is very great, any particular allele is likely to be absent most of
the time. As formula (7) and hence (12) break down at $q = 0$, we must
practically restrict attention to the probability distribution of alleles 
when present, 


in which f(q) is the probability in this sense, $f(q) = P(q)/[1 - P(0)]$.
Correspondingly we interpret $\phi(q)$ in (7) and (12) as the ordinate of the
probability distribution in this restricted sense, $f(q) = \phi(q)/2N$. As
(12) is not integrable by formula, the relative values of $\phi(q)$ were
calculated at appropriate intervals and the sum was found, with due
regard to the number of classes for which each calculated ordinate was
representative. The reciprocal of this sum gives the constant C. The
mean gene frequency was calculated from this empirical distribution.
The number of alleles is, of course, the reciprocal of the mean,


$$n = 1/\bar{q} = 1\Big/\sum q\, f(q). \tag{15}$$
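
The quadrature just described can be sketched in a few lines of Python. This is only an illustration and not the procedure actually used for the tables: it assumes that the relative ordinate has the form $q^{-1}(1 - 2q)^{Nk(1 - 2\hat{q}) + 2Nu - 1} e^{2Nkq}$, it sums over every class q = i/2N instead of over representative classes, and the names `allele_distribution` and `number_of_alleles` are hypothetical.

```python
import math


def allele_distribution(N, q_hat, k, u=0.0):
    """Return the grid q = i/2N (i = 1..N-1) and the normalised f(q) on it.

    Assumed relative ordinate (an assumption of this sketch):
        q**-1 * (1 - 2q)**(N*k*(1 - 2*q_hat) + 2*N*u - 1) * exp(2*N*k*q)
    """
    exponent = N * k * (1.0 - 2.0 * q_hat) + 2.0 * N * u - 1.0
    # q = 1/2 is left out because log(1 - 2q) is undefined there.
    qs = [i / (2.0 * N) for i in range(1, N)]
    logs = [
        -math.log(q) + exponent * math.log(1.0 - 2.0 * q) + 2.0 * N * k * q
        for q in qs
    ]
    shift = max(logs)  # work in logarithms to avoid overflow
    relative = [math.exp(v - shift) for v in logs]
    total = sum(relative)  # the reciprocal of this sum plays the role of C
    return qs, [r / total for r in relative]


def number_of_alleles(qs, f):
    """Reciprocal of the mean gene frequency, as in (15)."""
    mean_q = sum(q * p for q, p in zip(qs, f))
    return 1.0 / mean_q


# Illustrative call with N = 1000, q_hat = .03 and k = 1/(1 - 2*q_hat), u = 0.
qs, f = allele_distribution(N=1000, q_hat=0.03, k=1.0 / (1.0 - 2.0 * 0.03))
print(number_of_alleles(qs, f))
```

Working with logarithms of the ordinates keeps the exponential factor from overflowing when N is large; the normalisation then makes the ordinates sum to one.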


The average probability of loss of a gene in terms of the probability 
distribution, $\sum f(q) = 1$, is as indicated above, (1/2)f(1/2N) in the
ordinary case in which the effect of selection on very low gene fre- 
quencies is negligible. 

The terminal conditions are, however, peculiar in the case of
self-incompatibility alleles. In an artificial population consisting wholly
of genotype $S_1S_2$ except for one individual $S_1S_3$, the next generation
consists wholly of $S_1S_3$ and $S_2S_3$. Thus $q_3$ jumps in one generation from
the lowest possible value, if present at all, 1/2N, to the highest possible
value .50. The effect of selection is, of course, much less if there are
more than three alleles present, but is still not as negligible as usual.
Treating $\hat{q}$ as roughly equal to $\bar{q}$ and hence 1/n, either (2) or (3) reduces
roughly to $\Delta q = q/(n - 3)$ if q is small. The chance of fixation from
this class, which would be e~*™* in the absence of this selection, becomes 
ftor selection in the styles. The factor (n — 3)/(n — 1) 
was arrived at in the 1939 paper as roughly that which should be applied 
to the usual formula (1/2)f(1/2N) to give the total rate of loss per 
gene. More accurately, this rate of loss comes out .16 f(1/2N) for
n = 4, .28 f(1/2N) for n = 5, .37 f(1/2N) for n = 6, and
$\tfrac{1}{2}(n - 4.25) f(1/2N)/(n - 3)$ for larger n. These probably exaggerate the effect
since while they apply to sampling following selection (as among the 
pollen tubes growing on a style) they do not apply to sampling pre- 
ceding selection (as among plants with respect to flowering). The 
rate of loss per allele must be multiplied by n to give the number of 
alleles lost per generation (n/2)f(1/2N) as the upper limit. There is
equilibrium if loss is balanced by mutation from the array of alleles
on hand ud [2Nf(q)] = 2Nu. In the cases discussed later both the 
upper and lower limits are given for the effective mutation rate per 
gene required to balance losses: 


$$u_U = \frac{n f(1/2N)}{4N}, \qquad u_L = \frac{n - 4.25}{n - 3}\, u_U \quad \text{if } n \text{ is large.} \tag{16}$$


The estimate of mutation rate required to balance losses must
agree with the value of u used in (12) if large enough to make an
appreciable effect in the latter. This requires an iteration process. Similar
considerations apply if there is introduction from without. 
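
As a worked check of these limits, the following minimal sketch takes the upper limit as the number of alleles lost per generation, (n/2)f(1/2N), divided by 2N, and the lower limit as (n - 4.25)/(n - 3) times the upper one. The function name and argument names are hypothetical; the inputs are the values quoted later in the text for the 1939 formulae in the example with N = 10,000 (n = 34.22, losses of .96 × 10^-7 per generation).

```python
def mutation_rate_limits(n, losses_per_generation, N):
    """Upper and lower limits of the mutation rate needed to balance losses.

    losses_per_generation is (n/2) f(1/2N); dividing by the 2N genes gives
    the upper limit n f(1/2N) / 4N. The lower limit applies the large-n
    factor (n - 4.25)/(n - 3).
    """
    upper = losses_per_generation / (2.0 * N)
    lower = (n - 4.25) / (n - 3.0) * upper
    return upper, lower


upper, lower = mutation_rate_limits(n=34.22, losses_per_generation=0.96e-7, N=10_000)
print(upper, lower)  # roughly 4.8e-12 and 4.6e-12
```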

Fisher’s reference to the absence of explicit formula in the 1939 
paper seems to be based on my use of quadratures in calculating C 
and hence n and u (and also m in the study of population structure).
His statement that I relied on extensive numerical calculations based 
on trial values of numerous constants seems to refer to this and to the 
iteration required to make the equation balance in cases in which u 
(and m) were not negligibly small. He evades these complications by 
not dealing at all with the problem presented by the partial isolation 
of the colonies in a few sufficiently moist canyons separated by miles 
of high mountains, and by not balancing his equation for equilibrium 
when dealing with high mutation rates and finally by not calculating 
the number of alleles n except very roughly by the reciprocal of the 
equilibrium value (instead of by that of the mean). 

He interprets (14) as the frequency distribution of the array of
alleles instead of the probability array for any one allele. Letting $f'(q)$
be the frequency of alleles in a discrete class, $\sum f'(q) = n$. Calculation
of n by this formula would, however, require the sort of empirical
integration to which he seems to object. He does, however, impose
the condition $\sum q f'(q) = 1$, or $\int q\, \phi'(q)\, dq = 1$ in terms of the continuous
curve. The multiplication by q cancels the troublesome term $q^{-1}$ in
(12) or $p^{-1}$ in (14). By transforming essentially to the scale $x = 1 - 2p$,
he obtains an expression that is in the form of the Eulerian definite
integral that defines the gamma function except that its upper limit is
not infinite. He shows, however, that the values of x near the upper
limit (corresponding to the small values of q) are such that treatment
of the upper limit as infinite should usually give a good approximation.
He is thus able to obtain an explicit approximate formula for B from
which the terminal frequency at $q = 1/2N$ is found to be:

which he states “to a sufficient approximation may be simplified to


(18) 


the half of which must represent the number of new mutations required 
on the average in each generation”. 

Formula (18) is divided by 4N to give the mutation rate required 
to balance losses. This corresponds to the upper limit in (16). 


REEXAMINATION OF 1939 GRAPHS 


The conditions assumed by Fisher in his two examples are different
from any in the 1939 paper. To throw light on his statement that the
results shown in my graphs are very different from those that he obtains,
I have recalculated the distribution in three of them with variously
altered $\Delta q$ and $\sigma^2_{\delta q}$.


FIGURE 1


THEORETICAL DISTRIBUTION OF FREQUENCIES OF SELF-INCOMPATIBILITY ALLELES
N = 50, u = .01, $\hat{q}$ = .10314. Data of Table 3. Abscissas: number of
representatives in population; ordinates: frequencies [$\sum f(q) = 1$]. Connected
points = D, the preferred estimate; ○ = A (same as in Figure 3, 1939);
× = B (Fisher’s formulae); + = C (u ignored in calculation).

The case N = 500, R = .906 (or $\hat{q}$ = .04489), n = 33.76, u = .001
(Fig. 7 in the 1939 paper) was closest to one of those given by Fisher
(N = 1000, $\hat{q}$ = .03, n about 35, u = .001). Recalculation using the
correct sampling variance and the now preferred value of $\Delta q$ gave a
result that did not differ appreciably from the 1939 graph (revised
n = 33.24, u = .00097).

We will go into more detail in two cases chosen as those in which 
the effects of the changes in the formulae for $\Delta q$ and $\sigma^2_{\delta q}$ should be
greatest. 

Table 3 and Figure 1 deal with the case N = 50, R = .77 (or $\hat{q}$ =
.10314), u = .01 (Fig. 3, [11]). There is an appreciable difference
between the 1939 results and those derived (B) from Fisher’s formulae
for $\Delta q$ and $\sigma^2_{\delta q}$, but not one that is great enough to be of any
importance in interpreting a possible case in nature (n = 14.75 vs. 13.52,
$u_U$ = .0101 vs. .0081). The conditions differ in three respects [initially
assumed u (.01 vs. 0), k and $\sigma^2_{\delta q}$]. Comparison of the 1939 result (A)
with the one preferred now (D), which practically differs only in the
sampling variance, indicates that use of the correct formula has very
little effect (n = 14.50 instead of 14.75, $u_U$ = .0093 instead of .0101).
The slight difference in k can have little effect with n moderately large.
On comparing (D) with (C), which differs only in the omission of the
mutation term for $\Delta q$, we find that this makes a somewhat greater
difference (n = 13.36 in C instead of 14.75, $u_U$ = .0074 instead of
.0093). The distribution (B) based on Fisher’s formula, which differs
from (C) only in k, gives results that differ very little. The inclusion
or exclusion of the mutation terms in $\Delta q$ is clearly more important than
any of the other differences at this high mutation rate which, however,
is out of the question in nature, judging from Lewis’ results.

Table 4 and Fig. 2 deal with case N = 50, R = .49 (or $\hat{q}$ = .20319)
and mutation rate so low that it can be ignored as a cause of decrease
in gene frequency. In spite of the fact that the number of alleles is
here very small (estimates ranging from 5.124 to 5.195), there is little
difference in appearance between the distribution (B) derived from
Fisher’s formulae and that exhibited in the 1939 paper (Fig. 7). There
is, however, a 6.5-fold difference in frequency of the class with one
representative (.00325 in B, .00050 in A) and hence in the mutation
rate required to balance loss of alleles. This turns out to be due almost
wholly to the difference in the denominator of the selection term of
$\Delta q$, $(1 - 2\hat{q})$ in B, $(1 - 3\hat{q} + 2\hat{q}q)$ in A, which becomes important with
small n and large $\hat{q}$. We find nearly as great a difference between
(D) and (B), which differ only in this respect, and there is accordingly
little difference in this respect between (D) and (A), which differ in the
formula used for $\sigma^2_{\delta q}$. This difference in $\sigma^2_{\delta q}$ is, however, reflected
in a narrower distribution in D, as expected from using $q(1 - 2q)/2N$
instead of $q(1 - q)/2N$ with $\hat{q}$ as large as .20.


FIGURE 2


THEORETICAL DISTRIBUTION OF FREQUENCIES OF SELF-INCOMPATIBILITY ALLELES
N = 50, u = $10^{-8}$, $\hat{q}$ = .20319. Data of Table 4. Abscissas: number of
representatives in population; ordinates: frequencies [$\sum f(q) = 1$]. Connected
points = D, the preferred estimate; ○ = A (same as in Figure 5, 1939);
× = B (Fisher’s formulae).

The remarkable similarity in the general appearance of (A) and
(B) in contrast with D results from the compensatory effect of the
differences in $\sigma^2_{\delta q}$ and $\Delta q$ in A and B.

Summing up, the comparison of three of the cases given in the 1939 
paper including the two in which the greatest differences were antici- 
pated, with the results from Fisher’s formulae, and other estimates, 
has brought out only one difference of real importance, a six-fold greater 
mutation rate from using Fisher’s formula for $\Delta q$ which becomes
seriously inadequate where the number of alleles is as small as 5. We 
turn to the two new examples given by Fisher. 


FISHER’S EXAMPLES 


Table 5 and Fig. 3 deal with case N = 1000, $\hat{q}$ = .03. Fisher again
does not include a mutation term in $\Delta q$ but as he concludes that a
mutation rate of $10^{-3}$ is necessary to balance losses, I use this figure
in my calculations. For comparison with the 1939 method I have
substituted $1/k = 1 - 3\hat{q} + 2\hat{q}^2$ for $1 - 3\hat{q} + 2\hat{q}q$ since with $\hat{q}$ = .03,
the difference is negligible. The effect of substituting the correct
formula for the sampling variance may be seen by comparing (D) with
(A). The results differ in no important respect (n = 54.9 vs. 52.8,
$u_U = 1.03 \times 10^{-3}$ vs. $.94 \times 10^{-3}$, graphs barely distinguishable). The
effect of omitting the term $-uq$ in $\Delta q$ can be seen from comparison
of C with D. It is slightly more important (n = 52.3 vs. 54.9,
$u_U = .91 \times 10^{-3}$ vs. $1.03 \times 10^{-3}$). Finally the effect of using $(1 - 2\hat{q})$ in
$\Delta q$ instead of $(1 - \hat{q})(1 - 2\hat{q})$ can be seen by comparing (B) with (C).
It is clearly of no importance in this case (n = 52.8 vs. 52.3,
$u_U = .95 \times 10^{-3}$ vs. $.91 \times 10^{-3}$). The formulae underlying (B) differ from
those underlying (A) in all three respects but because of compensatory
effects there is virtually no difference whatever in the graphs or in
any of the deduced parameters (n = 52.78 vs. 52.78, $u_U = .95 \times 10^{-3}$
vs. $.94 \times 10^{-3}$). It should be emphasized that we have made our
calculations for (B) (Fisher’s formulae) in exactly the same way as in
the other cases, i.e. from calculation of ordinates at appropriate intervals,
including that at $q = 1/2N$, as a basis for estimating loss of alleles
and the mutation rate to balance these, and n as the reciprocal of the
mean. The distribution (B) seems to be exactly the same as that shown
on page 110 in Fisher’s 1958 edition. There is essential agreement in
the required mutation rate (“about one in a thousand”). There is
radical disagreement, however, with his statement on the number of
alleles (“about 37 alleles” on page 109, “about 35 alleles” on page 110)
in contrast with 52.8 as the reciprocal of $\bar{q}$, calculated from his own
assumptions.

Fisher’s estimate seems to be based on the supposition that n is 
only slightly greater than the reciprocal of $\hat{q}$ which gives 33.3.

A case could, perhaps, be made for citing the number of alleles as 
that for ones common enough to be likely to be found. The mean is 
much lower than the equilibrium point only if there is a large accumu- 
lation of alleles with very few representatives, as in this case. If ones 
that appear in only 17 or less of the 1000 zygotes are excluded, about 
35 are left in agreement with Fisher’s estimate, but the “presentation 
in a clearer light” claimed in the final sentence of his account would 
require that this shift from the meaning of “number of alleles” used
in the 1939 paper be stated, if this is what is meant. 

FIGURE 3

THEORETICAL DISTRIBUTION OF FREQUENCIES OF SELF-INCOMPATIBILITY ALLELES
N = 1000, $\hat{q}$ = .03, u = .001. Data of Table 5. Abscissas: number of
representatives in population in detail up to 7, by intervals of 5 from 10;
ordinates: frequencies 5f(q) [$\sum f(q) = 1$]. Curve through D, the preferred
estimate; + = C (u ignored), × = B (Fisher, 1958, p. 110) or A (1939
formulae) which are indistinguishable on the scale used. × is omitted beyond
2Nq = 7 since intermediate between C and D (except from 30 to 45 where
very close to the lower of these).

The apparent difference between his and my results is much greater
in his other example in which N = 10,000 with $\hat{q}$ again .03. In this
case the mutation rate required to balance losses is so small that it
need not be introduced into the formula for $\Delta q$. The distributions from
the 1939 formulae (A) (replacing $2\hat{q}q$ by $2\hat{q}^2$ in the denominator of $\Delta q$),
from Fisher’s formulae (B) and from those that I now prefer (D) are
given in Table 6 and Figure 4. It may be seen that there are no
appreciable differences in the graphs. The three estimates of number of
alleles are 34.22, 34.22 and 34.20 respectively and those given by Fisher
“somewhat more than 35” on page 108 and “about 35” on page 110
are not seriously different in this case in which $\bar{q}$ (= .0292) and $\hat{q}$
(= .03) do not differ much. The estimates of the number of alleles lost
per generation, (1/2)nf(1/2N), differ considerably more but still not very
seriously by my calculations ($.96 \times 10^{-7}$, $1.39 \times 10^{-7}$ and $.77 \times 10^{-7}$
respectively). Fisher’s estimate of this quantity “6.06 in a million
generations” is 63 times as great as that derived by my 1939 formulae
and seems to constitute the most impressive basis yet found for his
conclusions. We note, however, that his estimate is 44 times as great
as the estimate (B) which we arrive at from his own formulae for $\Delta q$,
$\sigma^2_{\delta q}$ and the distribution (which as given here under B in Figure 4) does


TABLE 6 
THEORETICAL DISTRIBUTION OF SELF INCOMPATIBILITY ALLELES 


A 
40 f(Q) 


1.80 

x 107 -1619 
0.96 

1077 - 1646 
640 -1423 
-0001 680 -1015 
-0003 720 -0609 
-0010 760 -0308 
-0035 800 -0128 
-0100 840 -0045 
-0247 880 -0013 
-0515 920 
-0901 960 
+1325 1000 


20,000 
0.03 n 


0 nf(1/2N) 
1/k (1 — @) X 
(1 — 2g) | (1 — 24) Uw 


2Nokq @(l — — 29) 


N = 10,000, $\hat{q}$ = .03. Four estimates differing in $\Delta q$ and $\sigma^2_{\delta q}$ (lower left).
Deduced parameters: lower right. Graphs: Figure 4.
A B D P| B D 
z = 2Ng 40 f(@) 40 40 f(a) z= 40 40 f(a) 
1 | 2.25 3.24 
x 1077 x 1077 .1612 .1633 
1.20 1.71 
2 x 1077 x 1077 .1656 .1682 
.1420 .1439 
200 .0001 .0001 .1022 .1027 
| * 240 .0003 .0003 .0614 .0608 
280 .0012 .0307 .0298 
320 .0038 .0039 .0127 
| 360 .0108 .0109 .0044 .0040 
400 .0260 .0260 .0013 
1: 440 .0532 .0530 .0003 .0003 
480 .0910 .0909 .0001 .0001 
520 .1319 
2N 20,000 20,000 2.922 2.922 2.924 
.03 .03 34.23 34.22 34.20 
1.925 2.772 1.539 
x107 | «1077 | x 1077 
ee 4.8 6.9 3.8 
x 10-2 | x 1078 | 10-2 
| 4.6 6.7 3.7 
x10-# | | x 1078 
> 
: 




not seem to differ in any respect from that figured on page 110 in his
book. The above differences are, of course, reflected in the estimate
of the mutation rates required to balance extinction, $u_U = 4.8 \times 10^{-12}$,
$6.9 \times 10^{-12}$ and $3.8 \times 10^{-12}$ for A, B, D respectively, while Fisher’s
statement for the same distribution as B is “about one in three thousand
million” or about $3.3 \times 10^{-10}$.


FIGURE 4


THEORETICAL DISTRIBUTION OF FREQUENCIES OF SELF-INCOMPATIBILITY ALLELES
N = 10,000, $\hat{q}$ = .03, u = $4 \times 10^{-12}$. Data of Table 6. Abscissas: number of
representatives in population at intervals of 40; ordinates: 40f(q) [$\sum f(q) = 1$].
Curve through D (the preferred estimates); × = B (Fisher, 1958, p. 110) or A
(1939 formulae) which are indistinguishable on the scale used (except at
2Nq = 600 at which ○ = A).


The difference depends on two things: (1) he used his simplified
approximation (18) for the frequency at $q = 1/2N$ which in this case
gives a result 4.4 times as great as from his own full formula (17), and
(2) he has misplaced a decimal point in making the calculation. He
gives $e^{-2Na^2} = e^{-18}$ as .000,000,152 which is ten times its value. As
to the former, his simplified formula is merely the first term in a series
which must be carried to at least nine terms to give a reasonably good
approximation to his full formula in this case (N = 10,000, a = .03).
Formula (18) is not very bad in the preceding case with N = 1,000,
a = .03. It may be added that the full formula (17) gives a very good
approximation in condensed distributions but is not very good where
there is a great accumulation of rare alleles. Direct calculation from
the ordinate at $q = 1/2N$ after determining C by quadrature seems
safer, and the latter seems necessary in any case for calculation of n.
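
The arithmetic behind these remarks is easy to verify. The short Python check below uses only figures quoted in the text (the printed value .000,000,152, the estimate of 6.06 alleles lost in a million generations, and the losses of .96 × 10^-7 and 1.39 × 10^-7 per generation for A and B); the variable names are illustrative only.

```python
import math

# Decimal slip: the printed figure against the true value of exp(-18).
true_value = math.exp(-18)
printed_value = 0.000000152
slip_factor = printed_value / true_value  # close to 10

# Ratios of Fisher's estimate of alleles lost per generation to those
# obtained by quadrature from the 1939 formulae (A) and his own (B).
fisher_losses = 6.06e-6
ratio_to_A = fisher_losses / 0.96e-7  # about 63
ratio_to_B = fisher_losses / 1.39e-7  # about 44

print(slip_factor, ratio_to_A, ratio_to_B)
```

The factor of about 44 for B is the product of the two effects named above: roughly 4.4 from the simplified approximation and 10 from the decimal point.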


CONCLUSIONS 


It is gratifying to find that in spite of differences in the formulae
for rate of change of gene frequency and for the sampling variance,
and in spite of different ways of deriving the distribution of gene
frequencies and of calculating the number of alleles lost by accidents of
sampling, there are no important differences between the numerical
results presented in the 1939 paper and those derived from Fisher’s
formulae, except for the mutation rate in one case, an extreme one with
only about five alleles in which his approximation for rate of change
of gene frequency becomes highly inadequate. Similarly the 1939
formulae give results that agree well with those from Fisher’s formulae
in two new examples that he gives, after correcting three errors in his
methods or computation. The most important
residual differences are in cases with very high mutation rates and are
due to his failure to introduce mutation into his formula for rate of
change of gene frequency.

We find that a modification of our 1939 formula for rate of change 
of gene frequency under selection gives a slightly better approximation, 
but that the 1939 formula is better than that used by Fisher. We 
accept Fisher’s exact formula for the sampling variance, but have 
found no case in which this changes our approximate results to an 
important extent. Changes are suggested in a factor presented in 1939 
to correct the estimates of loss of alleles for the unusually strong effect 
of selection at low frequencies in the case of self-incompatibility alleles. 

None of these changes affect the essential correctness of the 1939 
numerical results, including the graph showing the number of alleles 
expected in a closed random breeding population of given effective size 
at a given effective mutation rate.


TIES IN PAIRED-COMPARISON EXPERIMENTS USING A
MODIFIED THURSTONE-MOSTELLER MODEL*


W. A. GLENN AND H. A. DAVID
Virginia Polytechnic Institute, Blacksburg, Virginia, U.S.A.


1. Introduction 


When making paired comparisons a judge frequently is unable to 
express any real preference in a number of the pairs he judges. Never- 
theless, some of the methods in current use do not permit the judge 
to declare a tie. In other cases ties are permitted, but are ignored in 
performing the analysis. Alternatively, the ties are sometimes divided, 
equally or randomly, between the tied members of a pair. 

In assessing the merits of these various procedures it is helpful to 
distinguish between hypothesis testing and estimation. Several authors 
have shown that tied observations should be omitted in tests for the 
equality of treatment means. For the sign test, which may be regarded 
as a non-subjective form of paired comparisons, Hemelrijk [6] has proved 
that ignoring ties makes for a more powerful test of the null hypothesis 
than that obtained by dividing them equally among the positive and 
negative observations. Putter [9] has shown that random allocation 
of the tied observations reduces both the exact power and the asymptotic 
efficiency of the sign test. In a more general context Tocher [11] has 
essentially shown that ties are better ignored in tests involving dis- 
continuous variates. A related question is whether ties should be 
permitted at all. This has been considered by Gridgeman [4] who has 
proposed a probabilistic model allowing for ties. He concludes that, 
in discrimination tests, power is theoretically increased by permitting 
the judge to declare a tie when unable to express a real preference, if 
the recorded ties are left out of consideration in analyzing the data. 
In practice, however, the increase in power may be offset by a decrease 
in the subject’s efficiency of decision. In such circumstances Gridgeman 
recommends the prohibition of ties. 


*Research supported, in part, by the Office of Ordnance Research, U. S. Army Contract No. 
DA-36-034-ORD-1527RD.

When the estimation of response-scale values is the objective, there
is some evidence indicating that ties should be taken into account. In 
this connection Ferris [1] has commented that the information contained 
in the no-preference class has often been wasted or misused in dis- 
tributing the ties in order to reduce the data to the two figures repre- 
senting preferences for the items under comparison. 

Gridgeman [4] has given the opinion that in preference work ties 
should be admitted as they add information. At the same time, how- 
ever, he notes that a no-tie procedure is mandatory in methods such as 
that of Thurstone and Mosteller, which are set up to handle observed 
data given in a binomial form. 

Using a model of the Thurstone-Mosteller type (modified as de- 
scribed below), it is proposed in this paper to develop a method of 
estimating the relative strengths of treatment stimuli which makes 
provision for tied observations. 


2. The Proposed Model 


The paired-comparison experiment was introduced by Thurstone 
[10] for the purpose of estimating the relative strengths of treatment 
stimuli through subjective testing. He postulates a subjective con- 
tinuum over which sensations are jointly normally distributed with 
equal standard deviations and zero correlations between pairs. Mosteller 
[7] shows that the assumption of zero correlations may be relaxed to 
an assumption of equal correlations, with no change of method. Without 
further loss of generality we may let the scale of the sensation con- 
tinuum be so chosen that the difference of any two stimulus responses 
has unit variance. Let $\delta$ denote the difference of the true stimulus
responses of a pair of treatments. Then under this model the probability
distribution of the difference of the two responses is normal with mean
$\delta$ and unit variance.

By assuming that all differences, however small, are perceptible,
the Thurstone-Mosteller model prohibits the declaration of ties. Instead
of this assumption it is postulated in this paper that when the difference
between two responses lies below a certain threshold, the judge will be
unable to detect it; that is, if the difference lies in an interval between
$-\tau$ and $\tau$, the judge will declare a tie. Let us denote by $\pi_0$ the
probability that a tie will be declared. One would expect the value of $\pi_0$
to depend on both $\tau$ and $\delta$. This dependence is illustrated in Figure 1.
For $\tau = 0.4$ and $\tau = 0.2$ the broken lines show the variation of $\pi_0$ with
$\delta$, provided the distribution of the difference of the two responses is
normal with mean $\delta$ and unit variance. It is proposed later to replace
the cumulative normal function by a sine function such that an inverse
sine transformation can be used in scaling the response proportions.
For $\tau = 0.4$ and $\tau = 0.2$ the solid lines in Figure 1 show the variation
of $\pi_0$ with $\delta$, when based on the sine function.


FIGURE 1


VARIATION OF THE PROBABILITY OF A TIE ($\pi_0$) WITH THE DIFFERENCE ($\delta$) OF THE
TWO MEAN STIMULUS RESPONSES
Abscissa: difference of two mean responses, $\delta$ (arbitrary units); ordinate:
probability of a tie, $\pi_0$. Broken lines: based on cumulative normal function;
solid lines: based on sine function.
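
The dependence of the tie probability on the threshold and on the true difference under the normal model can be illustrated numerically. The sketch below assumes only what is stated above (the difference of the two responses is normal with mean $\delta$ and unit variance, and a tie is declared when it falls between $-\tau$ and $\tau$); the function name `tie_probability_normal` is hypothetical.

```python
from statistics import NormalDist


def tie_probability_normal(tau, delta):
    """P(-tau < D < tau) for D normal with mean delta and unit variance."""
    standard = NormalDist()  # standard normal distribution
    return standard.cdf(tau - delta) - standard.cdf(-tau - delta)


# The two thresholds used for the curves in Figure 1, over a few differences.
for tau in (0.4, 0.2):
    print(tau, [round(tie_probability_normal(tau, d), 4) for d in (0.0, 0.5, 1.0, 2.0)])
```

For a fixed threshold the tie probability is greatest when the two true responses are equal and falls away as their difference grows.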

Consider a paired-comparison experiment involving t treatments,
and let $X_i$ and $X_j$ be single responses of a judge to the ith and jth
stimuli. Let $S_i$ denote the true response to the ith stimulus
($i = 1, \dots, t$). Then under the Thurstone-Mosteller model the
probability distribution of the difference $X_i - X_j$ ($i \ne j$) is normal with
mean $S_i - S_j$ and unit variance. Let us define


$$F(a) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{a} e^{-y^2/2}\, dy, \tag{2.1}$$


from which it is evident that


$$F(-a) = 1 - F(a). \tag{2.2}$$

Denoting treatments i and j by $T_i$ and $T_j$ respectively, we may write
the probability that $T_i$ is preferred when $T_i$ and $T_j$ are compared as


$$P(X_i > X_j) = F(S_i - S_j). \tag{2.3}$$


It is evident from (2.2) and (2.3) that the only admissible judgments
in the Thurstone-Mosteller method are $X_i > X_j$ and $X_i < X_j$. We
propose to admit a third type of judgment, viz. no preference. This is
accomplished by postulating that there exists an interval of length $2\tau$,
centered at the origin of the distribution of $X_i - X_j$, within which the
judge cannot distinguish between $X_i$ and $X_j$, and will declare a tie.
Accordingly we define the parameters


$$\pi_{i.ij} = P[(X_i - X_j) > \tau] = F(-\tau + S_i - S_j),$$
$$\pi_{j.ij} = P[(X_i - X_j) < -\tau] = 1 - F(\tau + S_i - S_j),$$
and
$$\pi_{0.ij} = P[\,|X_i - X_j| < \tau\,] = F(\tau + S_i - S_j) - F(-\tau + S_i - S_j),$$


which are, in turn, the probabilities that $T_i$, $T_j$, or neither is
preferred in the comparison of $T_i$ and $T_j$. This set of relations replaces
(2.3) and by virtue of (2.2) may be expressed in the form

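$$\pi_{i.ij} + \pi_{0.ij} = F(\tau + S_i - S_j) \quad \text{and} \quad \pi_{j.ij} + \pi_{0.ij} = F(\tau - S_i + S_j). \tag{2.4}$$


Suppose n observations are made in the comparison of $T_i$ and $T_j$,
either by a single judge or by a group of judges having equal
discriminatory powers relative to the stimuli concerned. Let the data recorded
be


$p_{i.ij} = n_{i.ij}/n$ = proportion of preferences for $T_i$,
$p_{j.ij} = n_{j.ij}/n$ = proportion of preferences for $T_j$,
and
$p_{0.ij} = n_{0.ij}/n$ = proportion of ties,


when $T_i$ and $T_j$ are compared, where $n_{i.ij} + n_{j.ij} + n_{0.ij} = n$.


Replacing the parameters in (2.4) by their estimates, we have the
relations


$$p_{i.ij} + p_{0.ij} = F(\tau'_{ij} + S'_i - S'_j) \quad \text{and} \quad p_{j.ij} + p_{0.ij} = F(\tau'_{ij} - S'_i + S'_j),$$

where $\tau'_{ij}$ and $S'_i - S'_j$ are respectively the experimental values of
$\tau$ and $S_i - S_j$ resulting from the comparison of $T_i$ and $T_j$. For brevity
let us define


$$p_{i.ij} + p_{0.ij} = a_{ij} \quad \text{and} \quad p_{j.ij} + p_{0.ij} = a_{ji}, \tag{2.5}$$

so that we may write

$$\tau'_{ij} + S'_i - S'_j = F^{-1}(a_{ij}) \quad \text{and} \quad \tau'_{ij} - S'_i + S'_j = F^{-1}(a_{ji}),$$


where $F^{-1}(a_{ij})$ and $F^{-1}(a_{ji})$ are discrete variates which may be called
the pseudo-normal deviates exceeded with probabilities $(1 - a_{ij})$ and
$(1 - a_{ji})$ respectively. Solving for $\tau'_{ij}$ and $S'_i - S'_j$ we obtain


$$\tau'_{ij} = \tfrac{1}{2}[F^{-1}(a_{ij}) + F^{-1}(a_{ji})] \quad \text{and} \quad S'_i - S'_j = \tfrac{1}{2}[F^{-1}(a_{ij}) - F^{-1}(a_{ji})]. \tag{2.6}$$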
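
A minimal sketch of this solution for a single pair under the normal model follows. It assumes only the relations stated above: the two sums of proportions (preferences for one treatment plus ties) are transformed by the inverse normal function, and half their sum and half their difference are taken. The function name `threshold_and_difference` is hypothetical, and all three proportions must lie strictly between 0 and 1 for the inverse to exist.

```python
from statistics import NormalDist


def threshold_and_difference(p_i, p_j, p_0):
    """Experimental values of the threshold and of S_i - S_j for one pair.

    p_i, p_j: proportions of preferences for treatments i and j
    p_0:      proportion of ties (p_i + p_j + p_0 = 1)
    """
    inverse = NormalDist().inv_cdf  # inverse of the cumulative normal function
    a_ij = p_i + p_0
    a_ji = p_j + p_0
    tau = 0.5 * (inverse(a_ij) + inverse(a_ji))
    diff = 0.5 * (inverse(a_ij) - inverse(a_ji))
    return tau, diff


# Hypothetical data: 50 preferences for i, 20 for j and 30 ties in 100 trials.
print(threshold_and_difference(0.50, 0.20, 0.30))
```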


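Given data of this form for each of the pairs ($i, j = 1, \dots, t$) we should
like to determine least squares estimates of $\tau$ and the $S_i$. This is,
however, rendered awkward by lack of independence, since it can be
shown that asymptotically the covariance of $\tau'_{ij}$ and $S'_i - S'_j$ is


$$\tfrac{1}{4}\{\operatorname{Var}[F^{-1}(a_{ij})] - \operatorname{Var}[F^{-1}(a_{ji})]\}$$
$$= \frac{1}{4n}\left\{ \frac{(1 - \pi_{i.ij} - \pi_{0.ij})(\pi_{i.ij} + \pi_{0.ij})}{f^2(\tau + S_i - S_j)} - \frac{(1 - \pi_{j.ij} - \pi_{0.ij})(\pi_{j.ij} + \pi_{0.ij})}{f^2(\tau - S_i + S_j)} \right\},$$


where f(a) denotes the normal ordinate at abscissa value a.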
The difficulty presented by lack of independence is overcome, at
least in large samples, if we replace (2.1) by


$$F(a) = \tfrac{1}{2} \int_{-\pi/2}^{a} \cos y\, dy = \tfrac{1}{2}(1 + \sin a), \tag{2.7}$$


where a represents an angle in radian measure such that $-\tfrac{1}{2}\pi \le a \le \tfrac{1}{2}\pi$.
The relations (2.2) and (2.4) still hold, and by following steps similar
to those given above we find instead of (2.6)

$$\tau'_{ij} = \tfrac{1}{2}[\sin^{-1}(2a_{ij} - 1) + \sin^{-1}(2a_{ji} - 1)] \tag{2.8}$$
and


$$S'_i - S'_j = \tfrac{1}{2}[\sin^{-1}(2a_{ij} - 1) - \sin^{-1}(2a_{ji} - 1)], \tag{2.9}$$

the angles on the right being expressed in radian measure. For large
samples it may be shown that approximately


$$\operatorname{Var}[\sin^{-1}(2a_{ij} - 1)] = \operatorname{Var}[\sin^{-1}(2a_{ji} - 1)] = 1/n,$$


while from (2.8) and (2.9) it is evident that the covariance of $\tau'_{ij}$ and
$S'_i - S'_j$ is proportional to the difference of these two variances. Thus
for large samples $\tau'_{ij}$ and $S'_i - S'_j$ are uncorrelated.
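
The same calculation with the sine function in place of the cumulative normal can be sketched as follows. The sketch assumes F(a) = (1/2)(1 + sin a), so that the inverse of F at a proportion p is the angle whose sine is 2p - 1; the function name `sine_estimates` is hypothetical.

```python
import math


def sine_estimates(p_i, p_j, p_0):
    """Threshold and S_i - S_j for one pair, using the inverse sine scale.

    Angles are in radians. a_ij and a_ji are the proportions of preferences
    for i and for j, each increased by the proportion of ties.
    """
    a_ij = p_i + p_0
    a_ji = p_j + p_0
    angle_ij = math.asin(2.0 * a_ij - 1.0)
    angle_ji = math.asin(2.0 * a_ji - 1.0)
    tau = 0.5 * (angle_ij + angle_ji)
    diff = 0.5 * (angle_ij - angle_ji)
    return tau, diff


# Hypothetical data: 50 preferences for i, 20 for j and 30 ties in 100 trials.
print(sine_estimates(0.50, 0.20, 0.30))
```

Because both transformed proportions have the same large-sample variance 1/n on this scale, half their sum and half their difference are uncorrelated, which is the property the sine function is introduced to secure.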

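Let us denote by $\rho_{ij}$ the correlation between $\sin^{-1}(2a_{ij} - 1)$ and
$\sin^{-1}(2a_{ji} - 1)$. For large samples it may be shown that approximately


$$\rho_{ij} = -\left[\frac{\pi_{i.ij}\, \pi_{j.ij}}{(1 - \pi_{i.ij})(1 - \pi_{j.ij})}\right]^{1/2} \tag{2.10}$$

and hence that
$$\operatorname{Var}(\tau'_{ij}) = (1 + \rho_{ij})/2n \tag{2.11}$$
and
$$\operatorname{Var}(S'_i - S'_j) = (1 - \rho_{ij})/2n. \tag{2.12}$$


For $\pi_{0.ij} \ne 0$ these variances will not, in general, be homogeneous over
all pairs ($i, j = 1, \dots, t$). However, in the absence of extreme
comparisons the departures from homogeneity will be relatively small.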
It is therefore to be expected that estimates obtained from an un- 
weighted least squares solution will serve as good first approximations 
to the results of a weighted analysis. The unweighted analysis is 
described in Section 3; the weighted analysis and an iterative procedure 
are given in Section 4. The procedure of Section 3 is used in determining 
initial estimates of the weights. 

The magnitude of the parameter $\tau$ evidently depends on the ability
of the judge or judges to detect small differences in the stimuli con- 
cerned. When the judges may be regarded as equally competent in 
this respect, or when a single judge is used, it is reasonable to postulate 
that a common value of $\tau$ applies in all comparisons. If, however, we
wish to allow for differences in the discriminatory powers of the judges, 
we must postulate a different $\tau$ for each judge. Given a group of, say,
r judges, one may wish to test whether it is appropriate to postulate
that a common $\tau$ is applicable. This may be done by presenting to
each of the r judges n pairs made up from the material under test, in 
which the difference is known to or controlled by the experimenter. 
If the conditions of the experiment permit, the same set of pairs should 
be presented independently to each judge. Failing this, the sets should 
be made as nearly identical as possible. Each judge is asked to indicate 
the number of preferences which he has, and the number of no-preferences
or ties, the results being recorded in the form of an r × 2 contingency
table. In the controlled pairs the difference should be taken
small enough and the value of n large enough to yield cell frequencies 
of at least five. The usual chi-square test for independence of the row 
and column classifications, in this case on r — 1 degrees of freedom, is 
equivalent to a test of the null hypothesis that the judges do not differ 
appreciably in their ability to detect differences in the stimuli under 
test. If the null hypothesis is rejected, one should postulate a different
$\tau$ for each judge; otherwise a common $\tau$ may be used. Alternatively, if
the null hypothesis is rejected, it may be evident from the data of the 
test that within certain subgroups of the judges there is homogeneity 
in ability to detect differences in the stimuli. In such a case it would 
be appropriate to postulate a different $\tau$ for each subgroup.
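The chi-square test on the r × 2 table can be sketched as follows. This is a minimal illustration, assuming each row holds the (preferences, ties) counts of one judge; the function name and the counts for three judges are hypothetical and are not data from the paper.

```python
def chi_square_r_by_2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x 2 table.

    Each row is (number of preferences, number of ties) for one judge.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[c] for row in table) for c in range(2)]
    grand = sum(row_totals)
    stat = 0.0
    for row, r_tot in zip(table, row_totals):
        for c in range(2):
            expected = r_tot * col_totals[c] / grand
            stat += (row[c] - expected) ** 2 / expected
    return stat, len(table) - 1

# Hypothetical counts for three judges, 30 controlled pairs each.
judges = [(22, 8), (18, 12), (20, 10)]
stat, df = chi_square_r_by_2(judges)
```

The statistic is then referred to a chi-square table on r − 1 degrees of freedom, as described above.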


3. Unweighted Analysis of Balanced Experiments 


Consider a paired-comparison experiment involving t treatments,
in which n observations are made on each of the $\binom{t}{2}$ pairs. We assume,
for the moment, that a common value of $\tau$ applies in all comparisons.
In this section heterogeneity in the variances (2.11) and (2.12) is ignored,
the purpose being to obtain initial estimates of the parameters $\tau$ and
$S_i$ (i = 1, ..., t). For convenience in writing the data (2.8) and (2.9),
let us define the abbreviations $G_{ij}$, $H_{ij}$ (i ≠ j) by

$$\tau'_{(ij)} = \tfrac{1}{2}\left[\sin^{-1}(2a_{ij} - 1) + \sin^{-1}(2a_{ji} - 1)\right] = G_{ij} \qquad (3.1)$$

and

$$S'_i - S'_j = \tfrac{1}{2}\left[\sin^{-1}(2a_{ij} - 1) - \sin^{-1}(2a_{ji} - 1)\right] = H_{ij}. \qquad (3.2)$$


From the $G_{ij}$ we obtain the least squares estimate of $\tau$ as $\tau^*$ such
that

$$Q_1 = \sum_{i<j} (\tau - G_{ij})^2$$

is a minimum for $\tau = \tau^*$, where $\sum_{i<j}$ represents the sum over all $\binom{t}{2}$
pairs of integral values of i and j in which i < j. Upon differentiating
$Q_1$ with respect to $\tau$ and equating the result to zero we have the solution

$$\tau^* = \frac{2}{t(t - 1)} \sum_{i<j} G_{ij}. \qquad (3.3)$$


From the $H_{ij}$ we determine least squares estimates of the
$S_i$ (i = 2, ..., t), $S_1$ being taken as origin. These estimates, denoted
by $S^*_i$, are such that

$$Q_2 = \sum_{i<j} (S_i - S_j - H_{ij})^2$$

is a minimum for $S_i = S^*_i$ (i = 2, ..., t) and $S^*_1 = 0$. It will be
convenient to express the sum of squares $Q_2$ in matrix form. Let Y be a
column vector of the $\binom{t}{2}$ experimental $H_{ij}$, i.e.,

$$Y' = [H_{12}, H_{13}, \ldots, H_{t-1,t}],$$

and let B' represent the 1 × (t − 1) row vector of $S_i$, i.e.,

$$B' = [S_2, S_3, \ldots, S_t].$$


Finally let X be a $\binom{t}{2}$ × (t − 1) matrix consisting of 1's, −1's, and 0's
described as follows. Corresponding to each element in the vector Y,
there is a row in X, while the columns of X may be regarded as associated
respectively with the elements of the vector B'. The row corresponding
to $H_{ij}$ has +1 in the column corresponding to $S_i$ and −1 in the column
corresponding to $S_j$. Since there is no column corresponding to $S_1$,
no +1 entries are made in the X-matrix for i = 1. All elements not
otherwise mentioned are zero. In terms of these matrices we may write


$$Q_2 = (Y - XB)'(Y - XB), \qquad (3.4)$$
giving the required vector of least squares estimates as


$$B^* = (X'X)^{-1}X'Y. \qquad (3.5)$$
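The construction of X described above can be sketched in Python. This is a minimal illustration using plain lists; the function names `design_matrix` and `gram` are hypothetical. The product X'X computed at the end has t − 1 on the diagonal and −1 elsewhere, which is the pattern that makes (3.5) easy to evaluate.

```python
from itertools import combinations

def design_matrix(t):
    """Rows correspond to pairs (i, j) with i < j; columns to S_2, ..., S_t."""
    rows = []
    for i, j in combinations(range(1, t + 1), 2):
        row = [0] * (t - 1)
        if i >= 2:
            row[i - 2] = 1      # +1 in the column for S_i (no column when i = 1)
        row[j - 2] = -1         # -1 in the column for S_j
        rows.append(row)
    return rows

def gram(X):
    """X'X computed directly from the rows of X."""
    k = len(X[0])
    return [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]

X = design_matrix(5)
XtX = gram(X)
```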


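From the description of the matrix X it is evident that X'X, of
order t − 1, has the form

$$X'X = \begin{bmatrix} t-1 & -1 & \cdots & -1 \\ -1 & t-1 & \cdots & -1 \\ \vdots & \vdots & & \vdots \\ -1 & -1 & \cdots & t-1 \end{bmatrix},$$

of which the inverse is readily found to be

$$(X'X)^{-1} = \frac{1}{t}\begin{bmatrix} 2 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ 1 & 1 & \cdots & 2 \end{bmatrix}. \qquad (3.6)$$

Upon pre-multiplying the vector Y by X' and observing from (3.2)
that $H_{ij} = -H_{ji}$, one obtains the (t − 1) × 1 vector

$$X'Y = \begin{bmatrix} \sum_{j=1}^{t} H_{2j} \\ \sum_{j=1}^{t} H_{3j} \\ \vdots \\ \sum_{j=1}^{t} H_{tj} \end{bmatrix}, \qquad (3.7)$$

where, for convenience, we have taken $H_{ii} = 0$. Substitution from
(3.6) and (3.7) into (3.5) yields the solution

$$\begin{bmatrix} S^*_2 \\ S^*_3 \\ \vdots \\ S^*_t \end{bmatrix} = \frac{1}{t}\begin{bmatrix} 2 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ 1 & 1 & \cdots & 2 \end{bmatrix}\begin{bmatrix} \sum_{j=1}^{t} H_{2j} \\ \sum_{j=1}^{t} H_{3j} \\ \vdots \\ \sum_{j=1}^{t} H_{tj} \end{bmatrix}. \qquad (3.8)$$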
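The unweighted solution for the scale values can be sketched without forming any matrix inverse. The sketch assumes that the matrix of $H_{ij}$ is exactly skew-symmetric with zero diagonal, so that its row sums add to zero; under that assumption the product of the inverse of X'X with X'Y reduces to (row sum i − row sum 1)/t, a simplification worked out here rather than one stated in the text. The function name is hypothetical.

```python
def unweighted_scale_values(H):
    """Return [S*_1, ..., S*_t], with S*_1 = 0, from a skew-symmetric matrix H.

    H is a t x t list of lists with H[i][j] = -H[j][i] and zero diagonal.
    """
    t = len(H)
    row_sums = [sum(row) for row in H]
    # Row sums of a skew-symmetric matrix total zero, which turns the
    # "2 on the diagonal, 1 elsewhere" product into a simple difference.
    return [(row_sums[i] - row_sums[0]) / t for i in range(t)]

# Consistency check on artificial values: H built from known S is recovered.
S_true = [0.0, 1.0, 3.0]
H_demo = [[S_true[i] - S_true[j] for j in range(3)] for i in range(3)]
S_est = unweighted_scale_values(H_demo)
```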


The unweighted analysis of balanced experiments when a different
$\tau$ is postulated for each of r judges is considered in [3], Section 2.3. A
parameter $\tau_k$ is associated with the kth judge, k = 1, ..., r. The
estimate $\tau^*_k$ is found to be the mean of the $G_{ij}$ values pertaining to the
kth judge. Any estimate $S^*_i$ (i = 2, ..., t) for the over-all experiment
is shown to be the arithmetic mean of the estimates $S^*_i$ of the separate
judges. The analysis therefore consists of applying the procedure of
this section to the data for each judge, or group of judges for which a
$\tau_k$ is postulated. The separate $\tau^*_k$ and $S^*_i$ are obtained, the latter then
being pooled as indicated above.


4. Weighted Analysis of Balanced Experiments 


From (3.1) and (3.2) it is evident that we may write (2.11) and
(2.12) in the form


$$\operatorname{Var}(G_{ij}) = (1 + \rho_{ij})/(2n) \qquad (4.1)$$
and
$$\operatorname{Var}(H_{ij}) = (1 - \rho_{ij})/(2n). \qquad (4.2)$$


In order to allow for heterogeneity in these variances, each $G_{ij}$, $H_{ij}$
must be weighted in proportion to the inverse of its variance. This
requires estimates of the $\rho_{ij}$. We obtain these estimates by first
determining $\tau^*$ and the $S^*_i$ from the unweighted analysis of Section 3.
Let us define $a^*_{ij}$ and $a^*_{ji}$ by

$$\tau^* + S^*_i - S^*_j = \sin^{-1}(2a^*_{ij} - 1)$$

and
$$\tau^* - S^*_i + S^*_j = \sin^{-1}(2a^*_{ji} - 1),$$

which in turn yield

$$a^*_{ij} = \tfrac{1}{2}\left[1 + \sin(\tau^* + S^*_i - S^*_j)\right] \qquad (4.3)$$
and

$$a^*_{ji} = \tfrac{1}{2}\left[1 + \sin(\tau^* - S^*_i + S^*_j)\right]. \qquad (4.4)$$

It follows from (2.5) that $a^*_{ij}$ and $a^*_{ji}$ are respectively estimates of
$1 - \pi_{i.ij}$ and $1 - \pi_{j.ij}$ based on $\tau^*$ and the $S^*_i$. Letting $r_{ij}$ denote an
estimate of $\rho_{ij}$, we have from (2.10)

$$r_{ij} = -\left[\frac{(1 - a^*_{ij})(1 - a^*_{ji})}{a^*_{ij}\,a^*_{ji}}\right]^{1/2}. \qquad (4.5)$$


Instead of (4.1) and (4.2) we may write
$$\text{estimated } \operatorname{Var}(G_{ij}) = (1 + r_{ij})/(2n)$$

and

$$\text{estimated } \operatorname{Var}(H_{ij}) = (1 - r_{ij})/(2n).$$

Defining $v_{ij}$ as the weight associated with $G_{ij}$ and $w_{ij}$ as the weight
associated with $H_{ij}$, we shall take

$$v_{ij} = 1/(1 + r_{ij}) \qquad (4.6)$$

and

$$w_{ij} = 1/(1 - r_{ij}) \qquad (4.7)$$

for all $\binom{t}{2}$ pairs of integral values of i and j in which i < j.
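The weights can be computed directly from the fitted proportions. The sketch below assumes that (4.5) has the form $r_{ij} = -[(1 - a^*_{ij})(1 - a^*_{ji})/(a^*_{ij}a^*_{ji})]^{1/2}$; this form reproduces the weights printed for the numerical example of Section 7 to about three figures. The function names are hypothetical.

```python
import math

def correlation_estimate(a_ij, a_ji):
    """r_ij: minus the square root of (1 - a_ij)(1 - a_ji) / (a_ij * a_ji)."""
    return -math.sqrt((1.0 - a_ij) * (1.0 - a_ji) / (a_ij * a_ji))

def weights(a_ij, a_ji):
    """Return (v_ij, w_ij) = (1 / (1 + r_ij), 1 / (1 - r_ij))."""
    r = correlation_estimate(a_ij, a_ji)
    return 1.0 / (1.0 + r), 1.0 / (1.0 - r)

# Fitted proportions for the pair (1, 2) of the example in Section 7.
v12, w12 = weights(0.675, 0.489)
```

When the two proportions sum to exactly 1 (no expected ties) the estimate equals −1 and $v_{ij}$ is undefined, so such a pair would need separate handling.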


We shall denote the estimates obtained from the weighted analysis
by $\tau^{**}$ and $S^{**}_i$ (i = 2, ..., t), with $S^{**}_1 = 0$. Corresponding to $Q_1$,
the quantity which is minimized in determining $\tau^{**}$ is

$$Q_{1w} = \sum_{i<j} v_{ij}(\tau - G_{ij})^2,$$
and leads to the solution
$$\tau^{**} = \sum_{i<j} v_{ij}G_{ij} \Big/ \sum_{i<j} v_{ij}. \qquad (4.8)$$
In the determination of the $S^{**}_i$, $Q_2$ is replaced by
$$Q_{2w} = \sum_{i<j} w_{ij}(S_i - S_j - H_{ij})^2.$$
In expressing the sum of squares $Q_{2w}$ in matrix form we use the matrices
Y, B, and X as defined in Section 3. In addition we require a diagonal
matrix of weights, denoted by W, of order $\binom{t}{2}$, the diagonal elements
being $w_{ij}$ as defined in (4.7), and written in the same order as the
elements in the vector Y. In terms of these matrices we may write

$$Q_{2w} = (Y - XB)'W(Y - XB)$$

and obtain
$$B^{**} = (X'WX)^{-1}X'WY, \qquad (4.9)$$

as the vector of the $S^{**}_i$.
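If, for convenience, we take $w_{ii} = 0$, it is evident from the
definitions of X, W, and Y that we may write

$$X'WX = \begin{bmatrix} \sum_{j=1}^{t} w_{2j} & -w_{23} & \cdots & -w_{2t} \\ -w_{32} & \sum_{j=1}^{t} w_{3j} & \cdots & -w_{3t} \\ \vdots & \vdots & & \vdots \\ -w_{t2} & -w_{t3} & \cdots & \sum_{j=1}^{t} w_{tj} \end{bmatrix} \qquad (4.10)$$

and

$$X'WY = \begin{bmatrix} \sum_{j=1}^{t} w_{2j}H_{2j} \\ \sum_{j=1}^{t} w_{3j}H_{3j} \\ \vdots \\ \sum_{j=1}^{t} w_{tj}H_{tj} \end{bmatrix}. \qquad (4.11)$$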


One cannot write down a simple general form for the inverse of X'WX.
However, since it is necessarily symmetric, the Doolittle method may
be used. If high-speed computing equipment is available, much
computational labor may, of course, be avoided.
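With a computer the weighted normal equations are easily solved. The following is a minimal sketch, assuming the weights and the $H_{ij}$ are held as t × t lists of lists with zero diagonals; it uses ordinary Gaussian elimination with partial pivoting in place of the Doolittle method, and the function names are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def weighted_scale_values(w, H):
    """Return [S**_2, ..., S**_t] from symmetric w and skew-symmetric H."""
    t = len(H)
    # X'WX: row sums of w on the diagonal, -w_ij elsewhere, first row and
    # column removed because S_1 is the origin.
    xwx = [[sum(w[i]) if i == j else -w[i][j] for j in range(1, t)]
           for i in range(1, t)]
    # X'WY: row sums of the element-by-element products w_ij * H_ij.
    xwy = [sum(w[i][j] * H[i][j] for j in range(t)) for i in range(1, t)]
    return solve_linear(xwx, xwy)

# Artificial check: H built from known scale values is recovered exactly.
S_true = [0.0, 0.5, -1.0, 2.0]
H_demo = [[S_true[i] - S_true[j] for j in range(4)] for i in range(4)]
w_demo = [[0.0 if i == j else 1.0 + 0.1 * (i + j) for j in range(4)]
          for i in range(4)]
S_est = weighted_scale_values(w_demo, H_demo)
```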

Large-sample estimates of Var($\tau^{**}$) and of the variances and
covariances of the $S^{**}_i$ (i = 2, ..., t) are readily obtained. Using (4.6)
we may write

$$\text{estimated } \operatorname{Var}(G_{ij}) = 1/(2nv_{ij}),$$

which, in conjunction with (4.8) yields

$$\text{estimated } \operatorname{Var}(\tau^{**}) = \sum_{i<j} \frac{v_{ij}}{2n} \Big/ \Big(\sum_{i<j} v_{ij}\Big)^2 = 1 \Big/ \Big(2n \sum_{i<j} v_{ij}\Big),$$
since the comparisons on the pairs are made independently. Similarly,
using (4.7) we may write

$$\text{estimated } \operatorname{Var}(H_{ij}) = 1/(2nw_{ij}),$$
so that
$$\text{estimated } \operatorname{Var}(w_{ij}H_{ij}) = w_{ij}/(2n).$$

Combining this result with (4.11), we have the estimated
variance-covariance matrix associated with the vector X'WY given by
$$\Sigma_{X'WY} = (X'WX)/(2n).$$
Using (4.9) and the fact that $(X'WX)^{-1}$ is symmetric, we obtain the
estimated variance-covariance matrix associated with the vector $B^{**}$ as
$$\Sigma_{B^{**}} = (X'WX)^{-1}\,\Sigma_{X'WY}\,(X'WX)^{-1} = (X'WX)^{-1}/(2n).$$
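The variance formula for the weighted estimate of the threshold is a one-line computation. As a hedged illustration, the sketch below applies it to the ten weights $v_{ij}$ printed for the numerical example of Section 7 with n = 30; the function name is hypothetical.

```python
def var_tau_weighted(v_upper, n):
    """Estimated Var(tau**) = 1 / (2 n sum of v_ij), the sum over pairs i < j."""
    return 1.0 / (2.0 * n * sum(v_upper))

# Weights for the pairs 12, 13, 14, 15, 23, 24, 25, 34, 35, 45 (Section 7).
V = [3.4412, 3.2819, 2.9087, 3.4758, 3.0266,
     3.2020, 3.3887, 2.0991, 3.3647, 2.7624]
var_tau = var_tau_weighted(V, 30)
```

The result agrees with the value 0.0005 quoted in Section 7 when rounded to four places.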

If Var($G_{ij}$) and Var($H_{ij}$) are in fact quite homogeneous over all
pairs of treatments in the experiment, the results of the weighted analysis
of this section and the unweighted analysis of Section 3 will be in close
agreement. If, however, $\tau^*$, $\tau^{**}$ and the respective $S^*_i$ and $S^{**}_i$ are not
in close agreement, one may conclude that there is heterogeneity in
Var($G_{ij}$) and in Var($H_{ij}$), and the weighted analysis is the more
appropriate procedure. In such a case one may use the $\tau^{**}$ and the $S^{**}_i$
to determine an improved set of weights and repeat the procedure
described above. This iterative technique may be carried through as
many stages as required to give results on two successive stages which
do not differ within the limits of desired accuracy. However, in applying
the technique to experimental data, we have found that unless there
are some very extreme proportions present, only very minor changes
in the estimates of $\tau$ and the $S_i$ will be observed between the first and
second weighted solutions. Variances and covariances of the estimates
should, of course, be obtained only at the last stage of iteration.

The weighted analysis of balanced experiments when a different $\tau$
is postulated for each of r judges is considered in [3], Section 2.4. The
estimate $\tau^{**}_k$ associated with the kth judge (k = 1, ..., r) is found to
be a weighted arithmetic mean of the $G_{ij}$ values pertaining to the kth
judge, the form corresponding to (4.8). In determining the $S^{**}_i$ the
matrices X'WX and X'WY are shown to be respectively the sums of
the matrices (4.10) and (4.11) for the separate judges. The comments
made above regarding the inversion of X'WX, the variances and
covariances of the estimates, and the iterative procedure apply also
to this case.


5. Analysis of Non-Balanced Experiments 
In Sections 3 and 4 it is assumed that n observations are made on
each of the $\binom{t}{2}$ possible pairs. We now consider the analysis when $n_{ij}$
observations are made in the comparison of $T_i$ and $T_j$, thus allowing
for unequal numbers of observations on the pairs. As a special case of
unequal numbers, if there should be no observations on certain of the
pairs, the corresponding $n_{ij}$ values are zero.

As in Section 3, we first obtain preliminary estimates by ignoring
the heterogeneity in the $\rho_{ij}$. However, when n is replaced by $n_{ij}$ it
is evident from (4.1) and (4.2) that heterogeneity in the variances of
$G_{ij}$ and $H_{ij}$ arises as a result of the unequal numbers. Thus the
preliminary estimates must be obtained from a weighted analysis in which,
instead of (4.6) and (4.7), we take

$$v_{ij} = w_{ij} = n_{ij}. \qquad (5.1)$$

In the relations (4.8), (4.10), and (4.11) we simply replace $v_{ij}$ and $w_{ij}$
by $n_{ij}$.

Having obtained the initial estimates, we determine the $r_{ij}$ from
(4.5) and carry out a weighted analysis in which, instead of (4.6) and
(4.7), we take

$$v_{ij} = n_{ij}/(1 + r_{ij}) \qquad (5.2)$$
and
$$w_{ij} = n_{ij}/(1 - r_{ij}). \qquad (5.3)$$

Except for these definitions of $v_{ij}$ and $w_{ij}$, the procedure is identical
to that of Section 4.

It is easily shown that
$$\text{estimated } \operatorname{Var}(\tau^{**}) = 1 \Big/ \Big(2 \sum_{i<j} v_{ij}\Big)$$
and

$$\Sigma_{B^{**}} = (X'WX)^{-1}/2,$$

when n is replaced by $n_{ij}$ and $v_{ij}$, $w_{ij}$ are as defined in (5.2) and (5.3).
For small values of $n_{ij}$, these variances and covariances will be relatively
larger than those based on large samples. The absence of data on some
of the pairs leads to zero values of the corresponding $n_{ij}$, and hence
of $v_{ij}$ and $w_{ij}$. However, the solution will exist provided X'WX has
rank t − 1.

The iterative procedure described in Section 4 applies also to the 
analysis of non-balanced experiments. Similarly, no new problem 
is faced when a different $\tau$ is postulated for each of r judges, who make
unequal numbers of observations on the various pairs. 


6. Testing the Validity of the Model 


As a test of the goodness of fit of the proposed model, we compare 
the observed numbers in each category with the expected numbers 
derived from the solution. If the discrepancies are small we consider 
the solution to be internally consistent. For the binomial situation in 
which ties are not admitted, a test of this kind has been proposed by 
Mosteller [8]. When ties are admitted, a trinomial distribution is 
associated with each pair. Since the comparisons on the pairs are made 
independently, the data in a balanced experiment involving t treatments
constitute $\binom{t}{2}$ independent trinomial distributions.


We first determine the expected numbers in each of the categories.
Let us suppose that a solution has been carried out, leading to final
estimates of $\tau$ and the $S_i$ which we shall denote by $\tau''$ and $S''_i$
(i = 2, ..., t), with $S''_1 = 0$. Using these values we may determine
the corresponding expected values of $a_{ij}$ and $a_{ji}$, as $a''_{ij}$ and $a''_{ji}$,
satisfying the relations

$$a''_{ij} = \tfrac{1}{2}\left[1 + \sin(\tau'' + S''_i - S''_j)\right] \qquad (6.1)$$

and

$$a''_{ji} = \tfrac{1}{2}\left[1 + \sin(\tau'' - S''_i + S''_j)\right]. \qquad (6.2)$$

Let the expected numbers be denoted by

$n''_{i.ij}$ = expected number of preferences for $T_i$,

$n''_{j.ij}$ = expected number of preferences for $T_j$,
and

$n''_{0.ij}$ = expected number of ties,
when $T_i$ and $T_j$ are compared, where
$$n''_{i.ij} + n''_{j.ij} + n''_{0.ij} = n.$$

The corresponding observed numbers have been represented in Section
2 by $n_{i.ij}$, $n_{j.ij}$, and $n_{0.ij}$ respectively. It follows from the relations
(2.5) that we may write

$$n''_{j.ij} + n''_{0.ij} = n a''_{ij}$$

and

$$n''_{i.ij} + n''_{0.ij} = n a''_{ji},$$
from which we obtain

$$n''_{i.ij} = n(1 - a''_{ij}), \qquad (6.3)$$

$$n''_{j.ij} = n(1 - a''_{ji}), \qquad (6.4)$$
and

$$n''_{0.ij} = n(a''_{ij} + a''_{ji} - 1). \qquad (6.5)$$

If unequal numbers of observations were made on the pairs, n in the
above expressions would be replaced by $n_{ij}$.
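The expected numbers for a single pair follow directly from the final estimates. The sketch below is an illustration only: the function name is hypothetical, and it returns the three quantities n(1 − a''_ij), n(1 − a''_ji) and n(a''_ij + a''_ji − 1) in that order, the last being the expected number of ties. It assumes the arguments of the sines lie between −π/2 and π/2.

```python
import math

def expected_counts(tau, s_i, s_j, n):
    """Expected trinomial counts for one pair from tau'' and the scale values.

    Returns (n(1 - a_ij), n(1 - a_ji), n(a_ij + a_ji - 1)).
    """
    a_ij = 0.5 * (1.0 + math.sin(tau + s_i - s_j))
    a_ji = 0.5 * (1.0 + math.sin(tau - s_i + s_j))
    return n * (1.0 - a_ij), n * (1.0 - a_ji), n * (a_ij + a_ji - 1.0)

# Pair (1, 2) of the example in Section 7, using the rounded final estimates.
counts_12 = expected_counts(0.169, 0.0, -0.190, 30)
```

The three expected numbers always add to n, and the expected number of ties equals n sin τ'' cos(S''_i − S''_j), so it is largest for treatments with equal scale values.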

In testing the null hypothesis that the observed and expected
numbers are in agreement we employ the chi-square statistic, which,
for this case takes the form

$$X^2 = \sum_{i<j}\left[\frac{(n_{i.ij} - n''_{i.ij})^2}{n''_{i.ij}} + \frac{(n_{j.ij} - n''_{j.ij})^2}{n''_{j.ij}} + \frac{(n_{0.ij} - n''_{0.ij})^2}{n''_{0.ij}}\right].$$

For sufficiently large values of the expected numbers, the quantity $X^2$
is distributed approximately as chi-square with degrees of freedom
determined as follows. There are $\binom{t}{2}$ pairs which yield two independent
observations each. From the data we have estimated t parameters,
viz. $\tau$ and t − 1 values $S_i$ (i = 2, ..., t). Thus the degrees of freedom
for $X^2$ are

$$2\binom{t}{2} - t = t(t - 2).$$
We note that with t = 2 we can always fit the data perfectly, so there
should be zero degrees of freedom as the formula indicates.
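The statistic and its degrees of freedom can be sketched as follows. This is a minimal illustration assuming the usual Pearson form, the sum of (observed − expected)²/expected over all categories of all pairs; the function names and the counts in the usage lines are hypothetical.

```python
def pearson_x2(observed, expected):
    """Sum of (O - E)^2 / E over paired sequences of counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def degrees_of_freedom(t):
    """Two independent observations per pair, less t fitted parameters."""
    return t * (t - 1) - t

# Hypothetical observed and expected counts for a single pair.
x2_one_pair = pearson_x2([8, 5, 17], [10.0, 5.0, 15.0])
df_five = degrees_of_freedom(5)
```

For a complete experiment the contributions of all pairs are added before the total is referred to the chi-square table.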

The case in which r judges are involved and a separate $\tau$ is assumed
for each is handled in a similar way. The expected numbers are obtained
for each judge separately. A separate chi-square test may be made for
each judge, the degrees of freedom being t(t − 2) as above. If an
over-all test is desired, the judges being considered independent, the
separate sums of squares may be pooled over the r judges. In this case
the degrees of freedom are

$$rt(t - 1) - (r + t - 1).$$


If large numbers of observations are made by each judge, one may
set up chi-square tests for the validity of the assumption of a common
$\tau$, and for the homogeneity of the judges with respect to the estimates
of the $S_i$. Let $SS_1$ denote the sum of squares of deviations between
observed and expected numbers when based on a common $\tau''$ and
separate $S''_i$ values for each judge, and let $SS_2$ be the corresponding
sum of squares based on separate $\tau''_k$ and separate $S''_i$. Then the
difference $SS_1 - SS_2$ is distributed approximately as chi-square with
r − 1 degrees of freedom, and tests the validity of the assumption of
a common $\tau$. Let $SS_3$ denote the sum of squares of deviations between
observed and expected numbers when based on separate $\tau''_k$ and common
$S''_i$. Then the difference $SS_3 - SS_2$ is distributed approximately as
chi-square with (r − 1)(t − 1) degrees of freedom, and tests for
homogeneity of the judges with respect to the estimates of the $S_i$.


7. Computational Procedure with Numerical Illustration 


The computational steps in the analysis described in Sections 3 
and 4 are given below, and illustrated through application to experi- 
mental data. The data are taken from a recent paper by Fleckenstein, 
Freund, and Jackson [2], in which is described a paired-comparison 
experiment performed to determine the relative quality of five brands 
of carbon paper which are used by various departments of a company. 
For the purpose of reducing inventory costs the company wishes to 
standardize on the one or two brands of highest quality. Five secretaries 
were selected from each of six departments of the company. Fifteen 
arrangements of the ten possible pairs of brands were assigned to 
fifteen of these secretaries. The order of comparison in each pair was 
fixed. The remaining fifteen secretaries received the same fifteen 
random arrangements, but the pairs were in reverse order to those in 
the first group. Scores were allotted on a seven-point scale, which
included a no-preference class. For the purpose of applying our
procedure we assume that a common $\tau$ is applicable and have pooled the
data for each pair over the orders of presentation, yielding n = 30. 
In addition, we have pooled the mild, moderate, and strong preferences 
so that the resulting data for each pair are in a trinomial form consisting 
of preferences for one member or the other and no-preferences. These 
are presented in Table 7.1, in terms of the notation defined in Section 2. 


TABLE 7.1
DATA OF NUMERICAL EXAMPLE

Pair (i, j)    n_{i.ij}    n_{0.ij}    n_{j.ij}
   1, 2            8           5          17
   1, 3           21           6           3
   1, 4            4           2          24
   1, 5           13           4          13
   2, 3           18           4           8
   2, 4            7           6          17
   2, 5           16           6           8
   3, 4            1           3          26
   3, 5           10           4          16
   4, 5           23           3           4
Unweighted Analysis 

1. From the given data we calculate $a_{ij} = (n_{j.ij} + n_{0.ij})/n$ and
$a_{ji} = (n_{i.ij} + n_{0.ij})/n$ for each of the pairs. These may be represented
conveniently in the form of a t × t matrix, the $a_{ij}$ elements lying above
the main diagonal, the $a_{ji}$ elements below. Let this matrix be denoted
by A. Since no treatment is compared with itself ($a_{ii}$ is undefined),
for convenience we shall define the main diagonal elements to be zero.
For our example we have

$$A = \begin{bmatrix} 0 & 0.733 & 0.300 & 0.867 & 0.567 \\ 0.433 & 0 & 0.400 & 0.767 & 0.467 \\ 0.900 & 0.733 & 0 & 0.967 & 0.667 \\ 0.200 & 0.433 & 0.133 & 0 & 0.233 \\ 0.567 & 0.733 & 0.467 & 0.867 & 0 \end{bmatrix}.$$


2. For each of the non-diagonal elements in the matrix A we
determined $\sin^{-1}(2a_{ij} - 1)$ or $\sin^{-1}(2a_{ji} - 1)$ as the case may be. A
table which enables this to be done conveniently has been provided by
Hald [5] (Table XII). The table gives $2\sin^{-1}\sqrt{x}$ in radians for x =
0(.001)1. Since

$$\sin^{-1}(2a_{ij} - 1) = 2\sin^{-1}\sqrt{a_{ij}} - \pi/2,$$

we simply look up $2\sin^{-1}\sqrt{a_{ij}}$ and subtract $\pi/2$ to obtain
$\sin^{-1}(2a_{ij} - 1)$. Let us denote the matrix of inverse sine values
corresponding to the elements of the matrix A by $[\sin^{-1}(2a_{ij} - 1)]$, and again
define the main diagonal elements to be zero. In the example we have
carried four places in intermediate computations, rounding off the final
results to three places. The matrix $[\sin^{-1}(2a_{ij} - 1)]$ is found to be

$$\begin{bmatrix} 0 & 0.4848 & -0.4115 & 0.8242 & 0.1344 \\ -0.1344 & 0 & -0.2014 & 0.5633 & -0.0660 \\ 0.9273 & 0.4848 & 0 & 1.2054 & 0.3405 \\ -0.6435 & -0.1344 & -0.8242 & 0 & -0.5633 \\ 0.1344 & 0.4848 & -0.0660 & 0.8242 & 0 \end{bmatrix}.$$


3. To obtain the $G_{ij}$ values as defined by (3.1), we add its own
transpose to the matrix $[\sin^{-1}(2a_{ij} - 1)]$ and divide the result by 2.
Let us denote the resulting symmetric matrix by $[G_{ij}]$ which for the
example is found to be

$$[G_{ij}] = \begin{bmatrix} 0 & 0.1752 & 0.2579 & 0.0904 & 0.1344 \\ 0.1752 & 0 & 0.1417 & 0.2144 & 0.2094 \\ 0.2579 & 0.1417 & 0 & 0.1906 & 0.1372 \\ 0.0904 & 0.2144 & 0.1906 & 0 & 0.1304 \\ 0.1344 & 0.2094 & 0.1372 & 0.1304 & 0 \end{bmatrix}.$$


4. To obtain the $H_{ij}$ values as defined by (3.2), we subtract its
own transpose from the matrix $[\sin^{-1}(2a_{ij} - 1)]$ and divide the result
by 2. Let us denote the resulting skew symmetric matrix by $[H_{ij}]$
which for the example is found to be

$$[H_{ij}] = \begin{bmatrix} 0 & 0.3086 & -0.6694 & 0.7338 & 0.0000 \\ -0.3086 & 0 & -0.3431 & 0.3488 & -0.2754 \\ 0.6694 & 0.3431 & 0 & 1.0148 & 0.2032 \\ -0.7338 & -0.3488 & -1.0148 & 0 & -0.6938 \\ 0.0000 & 0.2754 & -0.2032 & 0.6938 & 0 \end{bmatrix}.$$

5. The estimate $\tau^*$ is the mean of the $G_{ij}$ values of Step 3. For the
example we have $\tau^* = 0.1682$.


6. The vector X'Y given by (3.7) is the vector of row sums in
$[H_{ij}]$ excluding the first. The vector of $S^*_i$ (i = 2, ..., t) with $S^*_1 = 0$
is obtained by substitution in (3.8), and for the example we find that

$$S^*_2 = -0.1895, \quad S^*_3 = 0.3715, \quad S^*_4 = -0.6328, \quad S^*_5 = 0.0786.$$
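Steps 1 to 6 can be reproduced with a short program. The sketch below is an illustration under stated assumptions: the counts are those of Table 7.1 read as (first count, ties, third count), the element of A above the diagonal is formed from the third count plus ties (which reproduces the printed matrix A), proportions are rounded to three places as printed, and the closed form (row sum i − row sum 1)/t is used for (3.8). Names such as `unweighted_analysis` are hypothetical.

```python
import math

# Table 7.1 rows: (i, j, first count, ties, third count); n = 30 per pair.
DATA = [
    (1, 2, 8, 5, 17), (1, 3, 21, 6, 3), (1, 4, 4, 2, 24), (1, 5, 13, 4, 13),
    (2, 3, 18, 4, 8), (2, 4, 7, 6, 17), (2, 5, 16, 6, 8),
    (3, 4, 1, 3, 26), (3, 5, 10, 4, 16), (4, 5, 23, 3, 4),
]

def unweighted_analysis(data, t, n):
    # Step 1: matrix A, proportions rounded to three places as printed.
    A = [[0.0] * t for _ in range(t)]
    for i, j, first, ties, third in data:
        A[i - 1][j - 1] = round((third + ties) / n, 3)
        A[j - 1][i - 1] = round((first + ties) / n, 3)
    # Step 2: inverse sine transform of the off-diagonal elements.
    M = [[math.asin(2 * A[i][j] - 1) if i != j else 0.0 for j in range(t)]
         for i in range(t)]
    # Steps 3 and 4: symmetric part G and skew-symmetric part H.
    G = [[(M[i][j] + M[j][i]) / 2 for j in range(t)] for i in range(t)]
    H = [[(M[i][j] - M[j][i]) / 2 for j in range(t)] for i in range(t)]
    # Step 5: tau* is the mean of the G_ij over the pairs i < j.
    pairs = [(i, j) for i in range(t) for j in range(i + 1, t)]
    tau = sum(G[i][j] for i, j in pairs) / len(pairs)
    # Step 6: S*_i = (row sum i - row sum 1) / t, with S*_1 = 0.
    rows = [sum(r) for r in H]
    S = [(rows[i] - rows[0]) / t for i in range(t)]
    return tau, S

tau_star, S_star = unweighted_analysis(DATA, 5, 30)
```

Agreement with the printed estimates should be expected to about three decimal places only, since the printed intermediate values were read from tables and rounded by hand.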
Weighted Analysis 


1. Using $\tau^*$ and the $S^*_i$ as obtained above we compute $a^*_{ij}$ and $a^*_{ji}$
as defined in (4.3) and (4.4). For this purpose Hald's Table XII may
again be used, by reversing the procedure of Step 2 in the unweighted
analysis. Let us denote by A* the matrix which has the $a^*_{ij}$ above the
main diagonal, the $a^*_{ji}$ below and zeros in the main diagonal. For the
example we have

$$A^* = \begin{bmatrix} 0 & 0.675 & 0.399 & 0.859 & 0.545 \\ 0.489 & 0 & 0.309 & 0.787 & 0.450 \\ 0.757 & 0.833 & 0 & 0.961 & 0.722 \\ 0.276 & 0.364 & 0.129 & 0 & 0.242 \\ 0.622 & 0.711 & 0.438 & 0.885 & 0 \end{bmatrix}.$$


2. The $r_{ij}$ as defined in (4.5) are then calculated. From these, the
weights $v_{ij}$ and $w_{ij}$ defined in (4.6) and (4.7) are obtained. For
convenience these may be written in the form of the matrices $[v_{ij}]$ and
$[w_{ij}]$. Since $v_{ii}$ and $w_{ii}$ are not defined, we take the diagonal elements
in these matrices to be zero. The matrix $[w_{ij}]$ should not be confused
with the diagonal matrix W of Section 4. $[w_{ij}]$ is merely a convenient
way of arranging the weights for computation purposes. For the
example we find that

$$[v_{ij}] = \begin{bmatrix} 0 & 3.4412 & 3.2819 & 2.9087 & 3.4758 \\ 3.4412 & 0 & 3.0266 & 3.2020 & 3.3887 \\ 3.2819 & 3.0266 & 0 & 2.0991 & 3.3647 \\ 2.9087 & 3.2020 & 2.0991 & 0 & 2.7624 \\ 3.4758 & 3.3887 & 3.3647 & 2.7624 & 0 \end{bmatrix}$$

and

$$[w_{ij}] = \begin{bmatrix} 0 & 0.5850 & 0.5899 & 0.6038 & 0.5840 \\ 0.5850 & 0 & 0.5989 & 0.5925 & 0.5865 \\ 0.5899 & 0.5989 & 0 & 0.6563 & 0.5873 \\ 0.6038 & 0.5925 & 0.6563 & 0 & 0.6105 \\ 0.5840 & 0.5865 & 0.5873 & 0.6105 & 0 \end{bmatrix}.$$


3. The estimate $\tau^{**}$, given by (4.8), is the weighted arithmetic
mean of the $G_{ij}$ values with the $v_{ij}$ as weights, and for the example we
find that $\tau^{**} = 0.1688$.
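Step 3 can be checked from the printed matrices. In the sketch below the values are the upper-triangle entries of $[v_{ij}]$ and $[G_{ij}]$ for the pairs 12, 13, 14, 15, 23, 24, 25, 34, 35, 45; $G_{25}$ is taken as 0.2094, the value implied by the inverse sine matrix of Step 2. The function name is hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

V = [3.4412, 3.2819, 2.9087, 3.4758, 3.0266,
     3.2020, 3.3887, 2.0991, 3.3647, 2.7624]
G = [0.1752, 0.2579, 0.0904, 0.1344, 0.1417,
     0.2144, 0.2094, 0.1906, 0.1372, 0.1304]

tau_2star = weighted_mean(G, V)
```

The result agrees with the printed value to three decimal places.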

4. We have written the symmetric matrix $[w_{ij}]$ in Step 2. Performing
element-by-element multiplication with the matrix $[H_{ij}]$,
we obtain the matrix of $w_{ij}H_{ij}$, denoted by $[w_{ij}H_{ij}]$, which for the
example is

$$[w_{ij}H_{ij}] = \begin{bmatrix} 0 & 0.1805 & -0.3949 & 0.4431 & 0.0000 \\ -0.1805 & 0 & -0.2055 & 0.2067 & -0.1615 \\ 0.3949 & 0.2055 & 0 & 0.6660 & 0.1193 \\ -0.4431 & -0.2067 & -0.6660 & 0 & -0.4236 \\ 0.0000 & 0.1615 & -0.1193 & 0.4236 & 0 \end{bmatrix}.$$


5. The (t − 1) × 1 vector of row sums of $[w_{ij}H_{ij}]$ excluding the
first is the vector X'WY given by (4.11). We then sum the elements
in each row of $[w_{ij}]$ excluding the first, and enter the results as diagonal
elements in the corresponding rows. Upon deleting the first row and
first column from the result, and writing a minus sign in front of all
off-diagonal elements, we have X'WX as defined in (4.10). For the
example we have

$$X'WY = \begin{bmatrix} -0.3408 \\ 1.3857 \\ -1.7394 \\ 0.4658 \end{bmatrix}$$
and

$$X'WX = \begin{bmatrix} 2.3629 & -0.5989 & -0.5925 & -0.5865 \\ -0.5989 & 2.4324 & -0.6563 & -0.5873 \\ -0.5925 & -0.6563 & 2.4631 & -0.6105 \\ -0.5865 & -0.5873 & -0.6105 & 2.3683 \end{bmatrix}.$$

The solution $B^{**}$ is given by (4.9). In this case the inversion of X'WX
was done with the aid of an IBM 650, the results being

$$S^{**}_2 = -0.1896, \quad S^{**}_3 = 0.3710, \quad S^{**}_4 = -0.6335, \quad S^{**}_5 = 0.0784.$$
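The weighted solution can be checked by solving the printed system directly. The sketch below is an illustration only: it uses Gaussian elimination with partial pivoting rather than an explicit inverse, the matrix entries are copied from the example, and the second element of X'WY is taken as 1.3857, the third row sum of the printed $[w_{ij}H_{ij}]$. The function name is hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

XWX = [
    [2.3629, -0.5989, -0.5925, -0.5865],
    [-0.5989, 2.4324, -0.6563, -0.5873],
    [-0.5925, -0.6563, 2.4631, -0.6105],
    [-0.5865, -0.5873, -0.6105, 2.3683],
]
XWY = [-0.3408, 1.3857, -1.7394, 0.4658]

S_2star = solve_linear(XWX, XWY)   # estimates for treatments 2, 3, 4, 5
```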


Iterative Procedure 


Using $\tau^{**}$ and $S^{**}_i$ as starting points, we obtain an improved set of
weights and carry out a new weighted analysis in the manner of Section
4. For the example of this section a second weighted solution yielded
results differing at the most by one unit in the fourth place from
those given above. Four places were carried in the calculations, and
upon finally rounding off our results to three places we have the
values shown in Table 7.2. The term "first estimate" refers to the
unweighted solution, second and third estimates being based on the
weighted solutions.


TABLE 7.2
ESTIMATES OF PARAMETERS IN NUMERICAL EXAMPLE

                      Estimates
Parameter     First     Second     Third
  τ            0.168     0.169     0.169
  S_2         −0.190    −0.190    −0.190
  S_3          0.372     0.371     0.371
  S_4         −0.633    −0.634    −0.633
  S_5          0.079     0.078     0.078

Variances and Covariances of the Estimates 


Denoting the final estimates of $\tau$ and the $S_i$ by $\tau''$ and the vector
B'', as in Section 6, we may use the results given in Section 4 to obtain
estimates of Var($\tau''$) and $\Sigma_{B''}$, the variance-covariance matrix of
B''. The estimate of Var($\tau''$) is obtained by multiplying the sum of
the final $v_{ij}$ by 2n and dividing the result into 1. For the example this
yields estimated Var($\tau''$) = 0.0005. $\Sigma_{B''}$ is obtained by dividing
$(X'WX)^{-1}$ by 2n. For the example we find that

$$\Sigma_{B''} = \begin{bmatrix} 0.0113 & 0.0057 & 0.0056 & 0.0057 \\ 0.0057 & 0.0112 & 0.0057 & 0.0057 \\ 0.0056 & 0.0057 & 0.0110 & 0.0057 \\ 0.0057 & 0.0057 & 0.0057 & 0.0113 \end{bmatrix}.$$


Goodness of Fit Test 


For the example of this section, the statistic $X^2$ defined in Section
6, and calculated using $\tau''$ and B'', is found to have the value 9.7582.
Referring to the chi-square table, we find that for 15 degrees of freedom


$$0.80 < P(X^2) < 0.90,$$


indicating good agreement between the observed and expected numbers. 


8. Summary and Discussion 


The Thurstone-Mosteller method of preference-ordering is extended 
to cover cases in which ties are admitted. This is accomplished by 
assuming that when the difference between a judge’s responses to two 
stimuli under comparison lies below a certain threshold a tie will be 
declared. This threshold and the mean stimulus responses are estimated 
by least squares. In order to overcome a difficulty presented by 
correlated data, at least in large samples, an angular response law is 
postulated for the response-scale differences. For every pair the no- 
preferences are added to the clear preferences expressed for each member 
of the pair. An inverse sine transformation is then applied to the 
corresponding proportions. The sums and differences of these trans- 
forms for each pair are used as the transformed data. 

For large samples the separate transforms have approximately a 
stable variance. However, in the variances of the sum and difference 
a covariance term enters, which causes non-homogeneity. A weighted 
least squares procedure is set up by first carrying out an unweighted 
analysis, from the results of which the covariance terms and hence 
appropriate weights are estimated. This procedure may be repeated 
in an iterative fashion, at each stage using weights based on the results 
of the previous stage. It has been found that unless there are a number 
of quite extreme comparisons involved, the unweighted analysis
yields very good first approximations to the final solution. This is 
evident in the example of Section 7, as well as in a number of additional 
examples given in [3]. The possibility of determining the starting 
weights from the raw data has been considered in [3]. It is concluded
that the weights thus derived are likely to be less reliable than those
obtained in the manner proposed. 

The problem of whether or not to permit ties in a paired-comparison 
experiment may hinge largely on psychophysical considerations. In 
dealing with a trained and experienced panel of judges, it would seem 
reasonable to permit a judge to declare a tie when unable to detect a 
real difference. To force a decision in all cases may lead the judge con- 
sciously or otherwise to substitute extraneous criteria of judgment 
when unable to make a decision based on the criteria of the experiment. 
Thus non-statistical biases may be introduced. On the other hand, if 
less reliable judges are permitted to declare ties, there may be a tendency 
to shirk decision. 

Situations arise in certain types of research involving paired com- 
parisons in which adequate provision for dealing with tied observations 
is virtually a necessity. An example is found in experiments which are 
carried out for the testing of uniformity of differences on ordinal scales, 
such as those used in color specification. The pairs of differences pre- 
sented to the judges are supposedly equal if the scale is properly con- 
structed. It would therefore seem reasonable to permit a competent 
judge to declare a tie when he cannot detect a difference. If the null 
hypothesis of equal differences is rejected, the estimation of the response 
scale values is necessary for the information of the experimenter in 
adjusting the metric of the scale. A method of carrying out this
estimation for data in which ties are recorded is therefore required.

A STATISTICAL MODEL FOR DIAGNOSING ZYGOSIS BY
RIDGE-COUNT


DONALD L. RICHTER AND SEYMOUR GEISSER


National Institute of Mental Health
Bethesda, Maryland, U.S.A.

INTRODUCTION 


Attention has been drawn recently by Smith & Penrose [1955]
and by Lamy et al. [1957] to the possibilities of using total ridge-count
as a criterion for diagnosing the zygosis of twins. In this paper a
statistical model for the behavior of ridge-counts in multiple births is
proposed and studied.


MODEL FOR TWINS 
Let

$$R_1, R_2, \ldots$$

be a sequence of independent random variables such that $R_i$ is
distributed as $N(\mu_i, \sigma_w^2)$ where $N(\theta_1, \theta_2)$ denotes a random variable
which is normally distributed with mean $\theta_1$ and variance $\theta_2$. Let $\mu_i$
be an observation on a random variable M which is distributed as
$N(\mu, \sigma_b^2)$. Let x, y be the observed ridge-counts on two persons of
the same mother. Then x, y are interpreted as observations on $R_i$, $R_j$
respectively. If i = j, the pair are monozygotic twins; if i ≠ j, the
pair are either siblings or dizygotic twins. Assume that $\sigma_w^2$, the
within-egg variance, and $\sigma_b^2$, the between-egg variance, are constant for all
mothers. Write

$$t = \left(x - \frac{x + y}{2}\right)^2 + \left(y - \frac{x + y}{2}\right)^2 = \tfrac{1}{2}(x - y)^2,$$

where x, y are the observed ridge-counts for a twin pair.

Suppose now that x, y are each observations on $R_i$. Then x − y
is distributed as $N(0, 2\sigma_w^2)$. If $\chi^2(m)$ denotes a random variable which
obeys the $\chi^2$-distribution with m degrees of freedom, then we have at
once that for monozygotic twins t is distributed as $\sigma_w^2\chi^2(1)$. Before
proceeding to the dizygotic case we recall the fact that if the
conditional distribution of Z given $\alpha$ is that of $N(\alpha, \theta)$ and if $\alpha$ is
distributed as $N(0, \theta')$, then the (unconditional) distribution of Z is that
of $N(0, \theta + \theta')$. Now suppose x, y are observations on $R_i$, $R_j$
respectively, i ≠ j. Then the conditional distribution of x − y given
$\mu_i - \mu_j$ is that of $N(\mu_i - \mu_j, 2\sigma_w^2)$. Then since $\mu_i - \mu_j$ is distributed
as $N(0, 2\sigma_b^2)$, we have that x − y is distributed as $N(0, 2\sigma_w^2 + 2\sigma_b^2)$.
Hence for dizygotic twins t is distributed as $(\sigma_w^2 + \sigma_b^2)\chi^2(1)$.

Lamy et al. [1957] give ridge-count data for 272 monozygotic
and 185 like-sexed dizygotic twin pairs, and Smith & Penrose [1955]
give data on 52 monozygotic pairs. For each twin pair we compute
$t = (1/2)(x - y)^2$ and then average over the 324 monozygotic pairs*
and over the 185 dizygotic pairs to obtain estimates respectively of
$\sigma_W^2$ and $\sigma_W^2 + \sigma_B^2$. This gives
est($\sigma_W^2$) = 93, est($\sigma_W^2 + \sigma_B^2$) = 1148,
and so est($\sigma_B^2$) = 1050.
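The averaging step can be illustrated by simulation. The sketch below is not part of the original analysis: it draws twin pairs under the model, with an assumed overall mean of 145 ridges and variances close to the estimates quoted above (all hypothetical inputs, as are the function name and the seed), and shows that the mean of $t$ recovers $\sigma_W^2$ for monozygotic pairs and $\sigma_W^2 + \sigma_B^2$ for dizygotic pairs.

```python
import random


def simulate_mean_t(n_pairs, var_within, var_between, monozygotic,
                    mu=145.0, seed=0):
    """Average t = (x - y)^2 / 2 over simulated twin pairs.

    mu, seed and the function name are illustrative assumptions; only the
    variance structure is taken from the model in the text.
    """
    rng = random.Random(seed)
    sd_w = var_within ** 0.5
    sd_b = var_between ** 0.5
    total = 0.0
    for _ in range(n_pairs):
        mu_i = rng.gauss(mu, sd_b)  # egg mean for the first twin
        # A monozygotic pair shares one egg mean; a dizygotic pair has two.
        mu_j = mu_i if monozygotic else rng.gauss(mu, sd_b)
        x = rng.gauss(mu_i, sd_w)
        y = rng.gauss(mu_j, sd_w)
        total += 0.5 * (x - y) ** 2
    return total / n_pairs


est_w = simulate_mean_t(20000, 93.0, 1050.0, monozygotic=True)
est_wb = simulate_mean_t(20000, 93.0, 1050.0, monozygotic=False)
print(round(est_w, 1), round(est_wb, 1), round(est_wb - est_w, 1))
```

The difference of the two averages plays the role of est($\sigma_B^2$); with real data the pairs would of course be the observed ridge-counts rather than simulated values.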

We now have that for monozygotic twins $t$ is distributed as
$93\chi^2(1)$. Hence we have in the pooled monozygotic sample 324
independent observations on the same random variable $t$ and, if our
model is to be valid, the frequency distribution of the observed $t$'s
should fit our theoretical distribution. This is in fact the case as a
chi-square goodness-of-fit test yields p > .50. For the dizygotic
distribution an analogous test yields p > .15. Hence we conclude that
the proposed model satisfactorily fits the available twin data.

To employ the derived theoretical distributions in the diagnosis
of twins, we first compute $t$ for the twin pair, and then find the
corresponding ordinates in the two distributions. For monozygotic twins
$t$ is distributed like $93\chi^2(1)$. Hence the density of $t$ is

$$f_M(t) = (186\pi t)^{-1/2}\exp(-t/186);$$

similarly for dizygotic twins, with est($\sigma_W^2 + \sigma_B^2$) = 1148,
the density of $t$ is

$$f_D(t) = (2296\pi t)^{-1/2}\exp(-t/2296).$$

For example, if we got a difference of 9 from a pair of twins, then
$t = \tfrac{1}{2}\,9^2 = 40.5$ and

$$f_M(40.5) = .00523, \qquad f_D(40.5) = .00182.$$


*The two monozygotic samples may in fact be pooled since the ratio of the separate
estimates = 102/91 = 1.1, which is an F-ratio with 52, 272 d.f. and yields p > .30.

The relative probability, given monozygosis, of the observed result
is .74 and similarly for dizygosis .26. In order to get the a posteriori
probability of monozygosis, we must first know the relative a priori
probability of monozygosis in the population being sampled. Suppose
that the relative a priori probability of monozygosis is p; then the
a posteriori probability may be found by the Bayes theorem to be


P = p(.74)/[p(.74) + (1 — p)(.26)] 


and for dizygosis the a posteriori probability would be 1 − P. In fact
the Bayes theorem may be used to combine relative probabilities
obtained from several independent criteria, as was done by Smith &
Penrose [1955]. It is of interest to point out that, if one criterion is
conclusive, say it gives a relative probability of 0 for monozygosis
(e.g. blood types are different for a twin pair), then the use of other
criteria combined by the Bayes theorem will yield an a posteriori
probability of 0, as it should. There is however a not insubstantial
class of twins for which no conclusive method exists and it is thought
that the use of ridge-count data in conjunction with other inconclusive
criteria would be particularly useful for this group.
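The diagnostic calculation can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the scale values 93 and 1148 are the estimates quoted in the text. It evaluates the two ordinates for an observed difference, norms them to relative probabilities, and applies the Bayes theorem for a given a priori probability of monozygosis.

```python
import math


def scaled_chi2_1_density(t, scale):
    """Density at t > 0 of a variable distributed as scale * chi-square(1)."""
    return math.exp(-t / (2.0 * scale)) / math.sqrt(2.0 * math.pi * scale * t)


def diagnose_pair(difference, prior_mz, scale_mz=93.0, scale_dz=1148.0):
    """Return t, both ordinates, the relative probability of monozygosis,
    and the a posteriori probability of monozygosis.

    scale_mz and scale_dz default to the estimates quoted in the text;
    the function itself is a hypothetical helper, not the authors' code.
    """
    t = 0.5 * difference ** 2
    f_m = scaled_chi2_1_density(t, scale_mz)
    f_d = scaled_chi2_1_density(t, scale_dz)
    relative_mz = f_m / (f_m + f_d)
    posterior_mz = prior_mz * f_m / (prior_mz * f_m + (1.0 - prior_mz) * f_d)
    return t, f_m, f_d, relative_mz, posterior_mz


t, f_m, f_d, relative_mz, posterior_mz = diagnose_pair(9, prior_mz=0.5)
print(t, round(f_m, 5), round(f_d, 5), round(relative_mz, 2))
```

With equal a priori probabilities the a posteriori probability coincides with the relative probability; any other prior shifts it according to the formula for P above.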

The ratio of these ordinates or likelihoods (without regard to a 
priori probabilities) is the relative chance of dizygosis from the observed 
data. As an example we have computed these odds for differences 
of 9 and 20, the same examples chosen by Smith & Penrose [1955]. 
Similar odds may be computed from Table 10 in Lamy et al. [1957].
For comparison, all are presented below, where the entry is the relative 
chance in favor of a dizygotic twin pair. 


Difference   t   Smith & Penrose   Lamy et al.   Richter & Geisser
9 40.5 .26 .35 
20 200 ae .78 


EXTENSIONS OF THE TWIN MODEL 


The assumptions of our model lend themselves readily to the treatment
of higher-order births. For triplets, let $x_1$, $x_2$, $x_3$ represent
the three observations, and $\bar{x}$ their mean. Write
$t = \sum_{i=1}^{3}(x_i - \bar{x})^2$. For monozygosis, the $x_i$ are
each observations on the same $R_i$ and we have that $t$ is distributed
as $\sigma_W^2\chi^2(2)$. For trizygosis three different $R_i$ are
involved and we have that $t$ is distributed as
$(\sigma_W^2 + \sigma_B^2)\chi^2(2)$. For dizygosis we take $x_1$ as an
observation on $R_i$, and $x_2$, $x_3$ as observations on $R_j$.
Letting $y = (x_2 + x_3)/2$, we have that

$$t = \sum_{i=1}^{3}(x_i - \bar{x})^2 = \tfrac{2}{3}(x_1 - y)^2 + \tfrac{1}{2}(x_2 - x_3)^2.$$

It follows that $t$ is distributed as
$(\sigma_W^2 + 4\sigma_B^2/3)\chi^2(1) + \sigma_W^2\chi^2(1)$.
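The split of the triplet sum of squares into a between-egg and a within-egg part is a purely algebraic identity, so it can be checked numerically. The short Python sketch below does this for one set of made-up ridge-counts; the function name and the counts are hypothetical.

```python
def triplet_sums(x1, x2, x3):
    """Return the total sum of squares about the mean and its two parts
    for a dizygotic triplet in which x2 and x3 come from the same egg."""
    mean = (x1 + x2 + x3) / 3.0
    total = sum((x - mean) ** 2 for x in (x1, x2, x3))
    y = (x2 + x3) / 2.0
    between = (2.0 / 3.0) * (x1 - y) ** 2  # single egg against the pair
    within = 0.5 * (x2 - x3) ** 2          # within the pair sharing an egg
    return total, between, within


total, between, within = triplet_sums(150, 120, 131)
print(total, between + within)
```

The two printed values agree up to rounding error, whatever counts are supplied.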

Note that the parameters occurring in the triplet distributions are 
estimable from twin data. Also, ordinates of linear sums of independent 
chi-square variates may be computed with precision by differentiating 
the series given by Robbins & Pitman [1949]. After computing the 
three ordinates they may be normed so that they sum to unity and 
hence provide probabilities for each of the three sorts of zygosis. 

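Quadruplets may arise from one, two, three or four eggs. Moreover,
if dizygotic, they may be split 2, 2 or 1, 3. The distribution of

$$t = \sum_{i=1}^{4}(x_i - \bar{x})^2$$

may be found for all five cases in a manner analogous to that above.
The results are presented below.


Case               Distribution of $t$
monozygotic        $\sigma_W^2\chi^2(3)$
dizygotic 2, 2     $(\sigma_W^2 + 2\sigma_B^2)\chi^2(1) + \sigma_W^2\chi^2(2)$
dizygotic 1, 3     $(\sigma_W^2 + 3\sigma_B^2/2)\chi^2(1) + \sigma_W^2\chi^2(2)$
trizygotic         $(\sigma_W^2 + \sigma_B^2)\chi^2(1) + (\sigma_W^2 + 3\sigma_B^2/2)\chi^2(1) + \sigma_W^2\chi^2(1)$
tetrazygotic       $(\sigma_W^2 + \sigma_B^2)\chi^2(3)$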
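One way to check distributions of this kind is through their means. Since a $\chi^2(m)$ variate has mean $m$, the expected value of $t$ is the sum of each coefficient times its degrees of freedom. Independently of that, for $n$ births from eggs contributing $m_1, m_2, \ldots$ individuals, the model gives $E(t) = (n - 1)\sigma_W^2 + (n - \sum m_g^2/n)\sigma_B^2$. This identity is not stated in the source; it follows from the variance of the mean under the model. The Python sketch below (hypothetical helper name) evaluates the two coefficients exactly for any split.

```python
from fractions import Fraction


def expected_t_coefficients(group_sizes):
    """Coefficients (c_W, c_B) in E(t) = c_W * var_within + c_B * var_between,
    where group_sizes lists how many of the births came from each egg."""
    n = sum(group_sizes)
    coef_within = Fraction(n - 1)
    coef_between = Fraction(n) - Fraction(sum(m * m for m in group_sizes), n)
    return coef_within, coef_between


for label, sizes in [
    ("monozygotic", (4,)),
    ("dizygotic 2, 2", (2, 2)),
    ("dizygotic 1, 3", (1, 3)),
    ("trizygotic", (2, 1, 1)),
    ("tetrazygotic", (1, 1, 1, 1)),
]:
    print(label, expected_t_coefficients(sizes))
```

For the 2, 2 split, for example, this gives $3\sigma_W^2 + 2\sigma_B^2$, and for the dizygotic triplet split (1, 2) it gives $2\sigma_W^2 + 4\sigma_B^2/3$, in agreement with the triplet result above.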


In conclusion we remark that our results for ridge-count are equally
applicable to any other diagnostic criterion which satisfies our initial
assumptions.


SUMMARY AND REMARKS 


We assume that ridge-count is normally distributed with a constant 
within-egg variance and a constant between-egg variance. The impli- 
cations of these assumptions are explored for twins and it is shown 
that our model accounts fairly well for variations in available twin data. 

The assumption that is possibly weakest in this model is that the 
variation between egg means within a mother is the same for all mothers. 
The fact that the dizygotic pairs did not fit a chi-square distribution 
quite as well (.20 > p > .15) as the monozygotic pairs indicates that 
the variation between egg means within a mother is possibly not quite 
the same from mother to mother. However, since we had a fairly large
number of dizygotic pairs and p was still far from significant, this 
variation between mother variances is probably insignificant in 
relation to the other variances. Hence the results as given should be 
useful for diagnosing zygosis in plural births. 

The authors are grateful to Dr. Gordon Allen and Dr. Samuel W.
Greenhouse, both of the National Institute of Mental Health, for
stimulating discussion.
143 NOTE ON A 5 × 2² FACTORIAL DESIGN¹


B. V. SHAH
University of Bombay, Bombay, India


Li [2] has given a 5 × 2² design in 10 blocks of 10 plots each. In
the design given by Li the loss of information on the interaction ABC 
is not uniformly distributed over the four degrees of freedom, so the 
author has suggested an alternative design which is balanced over each 
of the interactions; consequently the analysis of this design is simpler 
than that of Li’s design. 

In a 5 × 2² factorial experiment in three factors A, B, and C at 5,
2, and 2 levels respectively, the treatment combinations can be denoted
by $(ijk)$, $i = 0, 1, \ldots, 4$; $j, k = 0, 1$. If $\alpha$ denotes
combinations 00 and 11 and $\beta$ denotes combinations 01 and 10 of
factors B and C, then the design given by Li [2] can be written as in
Table 1.

TABLE 1
THE PLAN OF LI'S DESIGN

Replications       1         2         3         4         5
Blocks           1    2    1    2    1    2    1    2    1    2

                0α | 0β | 0β | 0α | 0β | 0α | 0β | 0α | 0α | 0β
                1α | 1β | 1α | 1β | 1β | 1α | 1β | 1α | 1β | 1α
Treatments      2β | 2α | 2α | 2β | 2α | 2β | 2β | 2α | 2β | 2α
                3β | 3α | 3β | 3α | 3α | 3β | 3α | 3β | 3β | 3α
                4β | 4α | 4β | 4α | 4β | 4α | 4α | 4β | 4α | 4β


The method of constructing such designs is described in Kempthorne
[1], Chapter 18. The whole point is to distribute $\alpha$'s and
$\beta$'s over the levels of the 5-level factor so that each block
contains 2 $\alpha$'s and 3 $\beta$'s or 3 $\alpha$'s and 2 $\beta$'s.
Li's design, which is partially balanced, is obtained by writing a
column of {0, 1, 2, 3, 4}, then putting in 2 $\alpha$'s and 3 $\beta$'s
for one block and then making up the complementary block which
contains 3 $\alpha$'s and 2 $\beta$'s. To get complete balance by this
method we would require 10 replicates (the number of ways of picking 2
boxes out of 5). Instead, if we take from 5 of the resulting replicates
only those blocks with 2 $\alpha$'s and 3 $\beta$'s and from the other
5 replicates only those blocks with 3 $\alpha$'s and 2 $\beta$'s, then
we obtain a balanced design as given in Table 2.

¹This work was supported by a Research Training Scholarship of the Government of India.


TABLE 2
PLAN OF THE DESIGN

Blocks           1    2    3    4    5    6    7    8    9    10

                0α | 0β | 0β | 0β | 0α | 0β | 0α | 0α | 0β | 0α
                1α | 1α | 1β | 1β | 1β | 1α | 1β | 1α | 1α | 1β
Treatments      2β | 2α | 2α | 2β | 2β | 2β | 2α | 2β | 2α | 2α
                3β | 3β | 3α | 3α | 3β | 3α | 3β | 3α | 3β | 3α
                4β | 4β | 4β | 4α | 4α | 4α | 4α | 4β | 4α | 4β


The above design is not resolvable but it indicates the fact that, if
there is no need to arrange the experiment by replicates, it is possible
at times to cut down the total amount of replication but still retain the
balance.

The author [3] has given a set of necessary and sufficient conditions
for balancing in a factorial experiment. It can be verified that the
above design satisfies those conditions and is therefore balanced.

The analysis of this design is as follows: Let $a_ib_jc_k$ denote the
total yield of the treatment combination $(ijk)$, $i = 0, 1, \ldots, 4$;
$j, k = 0, 1$. Further let

$$a_ib_j = \sum_{k=0}^{1} a_ib_jc_k, \qquad a_ic_k = \sum_{j=0}^{1} a_ib_jc_k, \qquad b_jc_k = \sum_{i=0}^{4} a_ib_jc_k;$$

$$a_i = \sum_{j=0}^{1}\sum_{k=0}^{1} a_ib_jc_k, \qquad b_j = \sum_{i=0}^{4}\sum_{k=0}^{1} a_ib_jc_k, \qquad c_k = \sum_{i=0}^{4}\sum_{j=0}^{1} a_ib_jc_k;$$

$$G = \sum_{i=0}^{4}\sum_{j=0}^{1}\sum_{k=0}^{1} a_ib_jc_k.$$

Let $I_i$ be equal to [$5(a_ib_0c_0 + a_ib_1c_1 - a_ib_0c_1 - a_ib_1c_0) + G$ − twice
the sum of the total yield of the blocks in which the treatment
combination $(i00)$ occurs.] Let $I = \sum_{i=0}^{4} I_i$. Then the sums
of squares due to various effects and interactions are as given in
Table 3.


TABLE 3
ANALYSIS OF VARIANCE

Effects or
Interactions    d.f.    Sum of squares

A                4      $(\sum_i a_i^2/20) - G^2/100$
B                1      $(\sum_j b_j^2/50) - G^2/100$
C                1      $(\sum_k c_k^2/50) - G^2/100$
AB               4      $[\sum_i (a_ib_1 - a_ib_0)^2/20] - (b_1 - b_0)^2/100$
AC               4      $[\sum_i (a_ic_1 - a_ic_0)^2/20] - (c_1 - c_0)^2/100$
BC               1      $I^2/2400$
ABC              4      $(\sum_i I_i^2/380) - I^2/1900$

The loss of information on the interaction BC is 1/25 and that on 
each degree of freedom of the interaction ABC is 6/25. 
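The balance of the design and the two losses of information can be checked mechanically. The Python sketch below is illustrative only: the block contents are transcribed from Table 2 (two cells of the scanned table are illegible and are filled in here by assumption, so that every block holds 2 or 3 $\alpha$'s), and the loss formulas in the code are this sketch's own derivation from the intrablock normal equations, not formulas quoted from the note.

```python
from fractions import Fraction
from itertools import combinations

# Levels of A that carry alpha (BC combinations 00 and 11) in each of the
# ten blocks; the remaining levels carry beta. Transcribed by assumption.
ALPHA_LEVELS = [
    {0, 1}, {1, 2}, {2, 3}, {3, 4}, {0, 4},
    {1, 3, 4}, {0, 2, 4}, {0, 1, 3}, {1, 2, 4}, {0, 2, 3},
]
N_LEVELS = 5
N_BLOCKS = len(ALPHA_LEVELS)


def sign(block, level):
    """+1 if the level carries alpha in this block, -1 if it carries beta."""
    return 1 if level in block else -1


def concurrence(i, j):
    """Blocks where levels i and j agree in type minus blocks where they differ."""
    return sum(sign(b, i) * sign(b, j) for b in ALPHA_LEVELS)


lambdas = {pair: concurrence(*pair) for pair in combinations(range(N_LEVELS), 2)}
balanced = len(set(lambdas.values())) == 1
lam = next(iter(lambdas.values()))

# Fraction of information lost to blocks (derived in this sketch).
loss_abc = Fraction(N_BLOCKS - lam, N_LEVELS * N_BLOCKS)
loss_bc = Fraction(N_BLOCKS + (N_LEVELS - 1) * lam, N_LEVELS * N_BLOCKS)
print(balanced, lam, loss_bc, loss_abc)
```

Every pair of levels of A gives the same concurrence, which is what balance means here, and the computed losses agree with the values 1/25 and 6/25 stated above.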

To obtain the estimates of individual treatment effects, define
$Q(ijk) = a_ib_jc_k - (1/10)$ times the total yield of the blocks in
which the treatment combination $(ijk)$ occurs. Let $t(ijk)$ be the
effect of the treatment combination $(ijk)$ and let $\hat{t}(ijk)$ be
the estimate of $t(ijk)$. Then

$$\hat{t}(i00) = \tfrac{1}{5}Q(i00) + J_i,$$
$$\hat{t}(i11) = \tfrac{1}{5}Q(i11) + J_i,$$
$$\hat{t}(i01) = \tfrac{1}{5}Q(i01) - J_i,$$
$$\hat{t}(i10) = \tfrac{1}{5}Q(i10) - J_i,$$
where

$$J_i = \frac{3}{950}I_i - \frac{5}{9120}I.$$
144 QUERY: On Combining Partial Correlation Coefficients


Standard textbooks of statistics (Snedecor and others) show how 
several simple correlation coefficients can be combined into one average 
value by the z transformation. Suppose these had been partial correl- 
ation coefficients independent of two other common variables; can 
these be likewise combined into one value? 


ANSWER: Under the usual assumptions in respect of normal variation,
the sampling distribution of a partial correlation coefficient
(calculated from sums of squares and products of deviations) is exactly
the same as that of a simple correlation coefficient, except for the
loss of 1 degree of freedom for each variate “partialled out.” I take
the enquirer’s phrase “independent of two other common variables” to
mean that he refers to partial correlation coefficients of $y_1$ and
$y_2$ with $y_3$, $y_4$ held constant. The answer to his question is
then “Yes, provided that, in all calculations of the weighted mean $z$,
its variance, and the bias, the values of $n$ (the numbers of sets of
observations contributing to the several values of $r$) be replaced by
$(n - 2)$.” A fairly obvious consequence is that no correlation
coefficient based on 5 or fewer entries gives usable information. Of
course this generalizes to coefficients with $h$ variates held constant
by using $(n - h)$ in place of $n$.
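The rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, each $z = \tanh^{-1} r$ is weighted by the reciprocal of its variance, which becomes $1/(n - h - 3)$ after the replacement of $n$ by $(n - h)$, and the small bias correction mentioned in the answer is omitted.

```python
import math


def combine_partial_correlations(rs, ns, held_constant):
    """Weighted mean of partial correlation coefficients via Fisher's z.

    rs: the partial correlation coefficients
    ns: the numbers of sets of observations behind each coefficient
    held_constant: h, the number of variates partialled out

    Returns (combined r, mean z, variance of mean z). The bias correction
    is deliberately left out of this sketch.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for r, n in zip(rs, ns):
        weight = (n - held_constant) - 3  # reciprocal of Var(z)
        if weight <= 0:
            continue  # too few observations to carry usable information
        weighted_sum += weight * math.atanh(r)
        weight_total += weight
    if weight_total == 0:
        raise ValueError("no coefficient has enough observations")
    mean_z = weighted_sum / weight_total
    return math.tanh(mean_z), mean_z, 1.0 / weight_total


# Hypothetical values: three partial correlations with two variates held constant.
r_bar, z_bar, var_z = combine_partial_correlations(
    [0.42, 0.55, 0.31], [18, 25, 12], held_constant=2)
print(round(r_bar, 3), round(var_z, 4))
```

A series with $n = 5$ and two variates held constant receives zero weight and drops out, which is the consequence noted in the answer.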

Snedecor gives an example concerning correlations between initial 
weight and gain in live weight for steers of three breeds in two years, 
6 values of r. Except for replacement of n by (n — 2), and the con- 
sequent omission of one series for which n = 4, the form of calculation 
would have been the same if he had had partial correlation coefficients 
with food intake and initial age held constant. 

The essential condition that must be fulfilled before this method of 
averaging correlation coefficients is used is that corresponding sums of 
squares (and products) from which every pair of different values of r 
are computed should arise from orthogonal comparisons among the 
data; this will be fulfilled if, as in Snedecor’s example, they are computed 
from measurements on different individuals, and more generally if they 
come from different independent lines of a table of analysis of variance 
and covariance. 
145 NOTE: ON A TEST FOR ORDER


J. B. CHASSAN
Saint Elizabeths Hospital, Washington, D. C.


Suppose we are dealing with a set of observed relative frequencies,
$x_i/n_i$, $i = 1, 2, \ldots, N$, from $N$ binomial distributions and we
wish to test the null hypothesis $p_1 = p_2 = \cdots = p_N$. It is well
known that the hypothesis can be tested by the use of the standard
chi-square contingency test with $N - 1$ d.f. A statistically
significant chi-square value is then interpreted as resulting in the
rejection of the null hypothesis. Such a test is most directly
appropriate if the description of the particular set of binomial
distributions and our state of knowledge concerning them are such that
we have no a priori reason to expect any one ordering of the $p_i$'s in
preference to any other. For example, in the case of $N = 2$, this is
analogous to the use of a two-tailed test, as one might apply when
testing two active drugs against each other. If, on the other hand, our
situation were such that the null hypothesis could be rejected only on
the basis of a particular ordering of the $p_i$'s, then the contingency
test can be strengthened. Again, for $N = 2$ (this could be the case of
an active drug against a placebo), the $\chi^2$-test is strengthened by
the familiar device of the use of a single-tailed test, the P-value
obtained from the standard (two-tailed) contingency test being halved.

For N > 2 one approach is to strengthen the chi-square test by 
testing for a linear regression on p (see [1]). A limitation of the regression 
approach as Cochran notes may occur when judgment has to be used 
in the selection of a metric for the independent or conditional variable. 
Although such a test can be valid, it does seem worth considering the 
possibility of using other tests which are independent of the particular 
element of judgment involved in transforming an ordinal scale to a 
metric one. 

The test proposed here is a simple extension of the procedure of
using a single-tailed test for $N = 2$, and is first limited to binomial
samples of equal size, i.e., $n_1 = n_2 = \cdots = n_N$.

This corresponds to a $2 \times N$ contingency table in which the
column sums are all equal. Then corresponding to any such observed
table of frequencies (and such that $x_i \ne x_j$ for all pairs
$i \ne j$), there will be $N!$ possible permutations of the columns, all
yielding the same $\chi^2$-test value. Next, it is noted that under the
null hypothesis each such permutation has the same probability. Then we
can say that the probability that the standard $\chi^2$ contingency
statistic exceeds a value $k$, corresponding to a significance level
$\alpha$, is comprised of component probabilities corresponding to two
patterns of frequency matrices. The first is the case for which
$x_i \ne x_j$ for all possible $i \ne j$, and the second is that for
which $x_i = x_j$ for one or more pairs $i \ne j$. We can label the
probabilities for the respective totals under the null hypothesis as
$\gamma$ and $\epsilon$ respectively, and the level of significance
corresponding to $\chi^2 > k$ may then be written as
$\alpha = \gamma + \epsilon$. Next, the probability that a specified
order, say $x_1/n > x_2/n > \cdots > x_N/n$, will produce a
$\chi^2 > k$ will then be $\gamma/N!$. Since
$\gamma/N! \le \alpha/N!$, we may use $\alpha/N!$ as the desired level
of significance.

In summary, given $N$ binomial samples with observed relative
frequencies $x_1/n_1 > x_2/n_2 > \cdots > x_N/n_N$, such that
$n_1 = n_2 = \cdots = n_N$, and a standard chi-square contingency test
value corresponding to a level of significance $\alpha$, the
probability that the particular observed ordering, under the null
hypothesis, was due to chance cannot exceed $\alpha/N!$.
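The summary rule is easy to apply mechanically. The Python sketch below is a hypothetical helper, not part of the note: it takes the significance level already obtained from the ordinary contingency test (or from the F-test, with $k$ samples in place of $N$) and returns the bound $\alpha/N!$ only when the observed relative frequencies fall strictly in the order predicted in advance.

```python
from math import factorial


def ordered_significance_bound(alpha, observed_in_predicted_order):
    """Upper bound alpha / N! on the significance level for a predicted order.

    alpha: significance level from the ordinary (unordered) test
    observed_in_predicted_order: relative frequencies (or sample means)
        listed in the order predicted to be strictly decreasing

    Returns None when the observed values do not follow the predicted
    order, in which case the bound does not apply. Equal sample sizes
    are assumed, as in the text.
    """
    values = list(observed_in_predicted_order)
    follows_order = all(a > b for a, b in zip(values, values[1:]))
    if not follows_order:
        return None
    return alpha / factorial(len(values))


# Hypothetical example: three equal-sized samples, contingency test gives 0.10.
print(ordered_significance_bound(0.10, [0.60, 0.45, 0.30]))
print(ordered_significance_bound(0.10, [0.45, 0.60, 0.30]))
```

For $N = 2$ the divisor is $2! = 2$, which reproduces the familiar halving of the two-tailed P-value.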

This modification of the contingency test obviously does not in
general apply if there are differences among the $n_i$, the reason
being that the relative frequencies among the smaller samples have a
greater probability of appearing at the extremes than those based on
the larger $n$'s, and consequently one cannot assume $N!$ equally
probable arrangements of relative frequencies. Conversely, however,
certain arrangements of sample sizes will allow the use of
$\alpha/N!$ as an upper bound a fortiori, for example, with respect to
$x_1/n_1 > x_2/n_2 > x_3/n_3$ when $n_2 < n_1, n_3$.

The same kind of test can obviously be applied to the analysis of
variance. Thus, let each column of the $n \times k$ matrix $[x_{ij}]$
represent a sample of $n$ observations from each of $k$ respective
homoscedastic normal distributions. The significance level,
$\alpha(F)$, of the usual analysis of variance test of the null
hypothesis $m_1 = m_2 = \cdots = m_k$ is obtained from the statistic

$$F = \frac{n\sum_{j}(\bar{x}_j - \bar{x})^2/(k - 1)}{\sum_{i}\sum_{j}(x_{ij} - \bar{x}_j)^2/[k(n - 1)]}$$

with $(k - 1)$ and $k(n - 1)$ d.f. This test is generally used against
unspecified alternatives.

Suppose, however, that one wishes to test against the alternative
hypothesis $m_1 > m_2 > \cdots > m_k$, and one actually obtains the
result $\bar{x}_1 > \bar{x}_2 > \cdots > \bar{x}_k$. Then the
significance level can be strengthened to $\alpha(F)/k!$. This follows
from the consideration that under the null hypothesis the $k!$ possible
permutations of the columns of any observed $[x_{ij}]$ are equally
probable.
146 QUERY: Partitioning the “Between Slopes” Sum of Squares for
Forest Growth Data


In a study of the growth of spruce trees diameter (x) and height 
(y) were recorded for each of 28 trees in each of 4 spacings. Covariance 
analysis yielded the following: 


                                                          Residuals
Spacing          d.f.    SS_x     SCP     SS_y    b_yx    d.f.   SS_y.x    M.S.     F

4 × 4             27     8.27    33.08   198.75   4.000    26     66.43
4 × 8             27     7.89    25.91   126.53   3.284    26     41.44
4 × 12            27    10.25    25.19   107.72   2.458    26     45.81
4 × 16            27    10.55    20.76    87.25   1.968    26     46.40
Within                                                    104    200.08    1.92
Common           108    36.96   104.94   520.25   2.839   107    222.30
Between slopes                                              3     22.22    7.41    3.86*

There is apparently a patterned change of slope with increasing 
spacing, and it would be of interest to partition the ‘‘between slopes” 
sum of squares into linear, quadratic, and cubic components. 

I know that the “between slopes” sum of squares may be calculated
by $\sum_{i=1}^{4}(b_i - b_0)^2\,SS_{x_i}$, where $b_i$ and $SS_{x_i}$
are, respectively, the regression coefficient and the sum of squares in
the $i$th spacing, and $b_0$ is the common regression coefficient, but
I have not been able to use this information in devising a plan for the
partitioning.


ANSWER: The problem is that of constructing orthogonal components for
unequally weighted data since we know from regression theory that
$\mathrm{Var}(b_{yx}) = \sigma^2/SS_x$. Wishart and Metakides
(Biometrika 40: 361) give a detailed account of a systematic approach
to the problem. In this instance we require only the relevant
partitioning of the sum of squares between slopes: we omit calculation
of the actual polynomials, except for the dominant linear term, and of
the variances of the regression coefficients.


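Taking the spacings as the $x_i$:      1       2       3       4
regression coefficients are $b_i$:   4.000   3.284   2.458   1.968
with weights $w_i$:                  8.27    7.89    10.25   10.55

Using the notation of the above paper, required calculations are

$S_{00} = \sum w_i = 36.96$,
$S_{10} = S_{01} = \sum w_ix_i = 97.00$,
$S_{20} = S_{11} = S_{02} = \sum w_ix_i^2 = 300.88$,
$S_{30} = S_{21} = S_{12} = S_{03} = \sum w_ix_i^3 = 1023.34$,
$S_{31} = S_{22} = S_{13} = \sum w_ix_i^4 = 3665.56$,
$S_{32} = S_{23} = \sum w_ix_i^5 = 13554.70$,
$S_{33} = \sum w_ix_i^6 = 51198.28$,
$S_{y0} = \sum w_ib_i = 104.94766$,
$S_{y1} = \sum w_ib_ix_i = 243.53462$,
$S_{y2} = \sum w_ib_ix_i^2 = 695.67194$,
$S_{y3} = \sum w_ib_ix_i^3 = 2249.41118$,
$S_{yy} = \sum w_ib_i^2 = 320.199$.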


These figures are now entered in a table as described by Wishart
and Metakides which enables the required partial sums

$S_{p1.0} = S_{p1} - S_{p0}S_{10}/S_{00}$   $(p = 1, 2, 3, y)$;
$S_{p2.01} = S_{p2.0} - S_{p1.0}S_{21.0}/S_{11.0}$   $(p = 2, 3, y)$;
$S_{p3.012} = S_{p3.01} - S_{p2.01}S_{32.01}/S_{22.01}$   $(p = 3, y)$;

to be calculated systematically. Corresponding regression coefficients,

$b_{y0} = S_{y0}/S_{00}$, representing the mean,
$b_{y1.0} = S_{y1.0}/S_{11.0}$, representing the linear component,
$b_{y2.01} = S_{y2.01}/S_{22.01}$, representing the quadratic component,
$b_{y3.012} = S_{y3.012}/S_{33.012}$, representing the cubic component,

are calculated in the same diagram, and their associated sums of squares
are obtained in the usual way (e.g., for the cubic component the S.S. is
$b_{y3.012}S_{y3.012}$).

The completed calculation is as follows: 
Thus the linear term contributes virtually all the between slopes sum
of squares, and the complete analysis of variance of the residual data
may be constructed:


Source                              d.f.     S.S.     M.S.
Linear difference in spacing          1     21.97
Quadratic difference in spacing       1      0.14
Cubic difference in spacing           1      0.09
Between spacings                      3     22.22     7.41
Within spacings                     104    200.08     1.92
                                    107    222.30


The linear relationship between regression coefficients and spacings
is, in the notation of Wishart and Metakides,

$$b = b_{y0} - b_{y1.0}S_{10}/S_{00} + b_{y1.0}x,$$

which on our scale becomes

$$b = 4.647 - 0.6888x.$$


In this instance the weights of the four slopes are not grossly 
different, and a standard unweighted linear regression procedure might 
be considered an adequate approximation. We obtain the regression 
equation b = 4.658 — 0.6922 x with corresponding analysis of variance: 


                                    d.f.     S.S.
Linear difference in spacing          1     22.14
Remainder                             2      0.15
Between spacings                      3     22.29


in which the S.S. as constructed in the usual way have been multiplied
by $\tfrac{1}{4}\sum w_i$ to make the table comparable with that for
the weighted regression above.
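The partition can be reproduced with a short Python sketch. It does not follow the tabular layout of Wishart and Metakides; instead it builds the weighted orthogonal polynomials directly by Gram-Schmidt orthogonalization of 1, $x$, $x^2$, $x^3$, which is a different route to the same sequential sums of squares. The function name is hypothetical; the data are the slopes and weights given in the answer.

```python
def orthogonal_components(x, b, w):
    """Sequential (coefficient, sum of squares) for the mean, linear,
    quadratic, ... components of b on x with weights w, obtained by
    weighted Gram-Schmidt orthogonalization of the powers of x."""
    basis = []       # values of each orthogonal polynomial at the x's
    components = []  # one (coefficient, sum of squares) pair per degree
    for degree in range(len(x)):
        v = [xi ** degree for xi in x]
        for u in basis:
            proj = (sum(wi * vi * ui for wi, vi, ui in zip(w, v, u))
                    / sum(wi * ui * ui for wi, ui in zip(w, u)))
            v = [vi - proj * ui for vi, ui in zip(v, u)]
        basis.append(v)
        s_yp = sum(wi * bi * vi for wi, bi, vi in zip(w, b, v))
        s_pp = sum(wi * vi * vi for wi, vi in zip(w, v))
        components.append((s_yp / s_pp, s_yp * s_yp / s_pp))
    return components


x = [1, 2, 3, 4]
b = [4.000, 3.284, 2.458, 1.968]
w = [8.27, 7.89, 10.25, 10.55]

weighted = orthogonal_components(x, b, w)
mean_b = weighted[0][0]
slope, linear_ss = weighted[1]
x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
intercept = mean_b - slope * x_bar
between_ss = sum(ss for _, ss in weighted[1:])

# Unweighted fit, with its S.S. rescaled by (1/4) * sum(w) for comparison.
unweighted = orthogonal_components(x, b, [1.0] * len(x))
scale = sum(w) / len(w)
unweighted_slope = unweighted[1][0]
unweighted_linear_ss = scale * unweighted[1][1]

print(round(intercept, 3), round(slope, 4), round(linear_ss, 2))
print(round(unweighted_slope, 4), round(unweighted_linear_ss, 2))
```

The weighted linear component reproduces the fitted line and its sum of squares; the three components together add to the weighted sum of squares between slopes, which differs from the tabulated 22.22 only through rounding of the $b_i$.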

Wishart and Metakides give a comprehensive list of references to 
earlier work on the mathematical and on the computational aspects of 
fitting orthogonal polynomials. 

R. M. Cormack 
University of Aberdeen 
Aberdeen, Scotland 
147 NOTE: Tables for W. L. Stevens' “Asymptotic Regression”


H. LINHART
National Institute for Personnel Research
South African Council for Industrial and Scientific Research
Johannesburg, South Africa


Stevens [1951] considered the estimation of parameters in an
exponential model. We have recently prepared a report [1959] that
contains some tables which could be of use in applications of Stevens’
method. The tables give (Stevens’ notation) $F_{aa}$, $F_{ab}$,
$F_{ar}$, $F_{bb}$, $F_{br}$, $F_{rr}$ for n = 11 [r = .74(.01).99],
n = 20 [r = .80(.01).99], and n = 40
[r = .20(.05).55, .58(.02).64(.01).99]. Not more than 6 decimals are
given, for n = 11 up to 7 figures and for n = 20, n = 40 up to 8 figures.
A few copies of the report may be obtained from the National Institute
for Personnel Research, P. O. Box 10319, Johannesburg, Union of South
Africa.
ABSTRACTS


The following papers were presented at the meeting of the Western North American 
Region which was held in San Diego, June 15 and 16, 1959. 


629 W. RAY BRYAN (National Cancer Institute, Bethesda, Maryland).
Some Factors in the Interpretation of Tumor Response in Theoretical
Studies of Virus and Chemical Carcinogens.


The observable biological responses to viral and chemical carcinogens 
are not immediate reactions to these agents, but amplifications of the 
initial agent-host interactions brought about, after a period of time, 
by the growth of altered cells of localized areas into colonies of de- 
tectable size. Therefore, within a given animal, the development of a 
tumor response is a problem in cell population behavior. The measurable 
parameters of the tumor response are greatly influenced by host 
factors, and considerable heterogeneity is frequently encountered 
among different animals of a common lot, even in the best of available 
test-animal populations. Some biological factors associated with this 
heterogeneity were discussed and some newer biometric methods 
(described by P. Armitage) for dealing with the heterogeneous data 
were reviewed. The appropriate interpretation of tumor-response data 
is of importance not only for the practical problem of bio-assay, but 
also in examinations of biological data for consistency with math- 
ematical theory. 


630 G. E. DICKERSON AND K. GOODWIN (Kimber Farms, Inc., Niles,
California). Sampling Errors of Alternative Estimates of Genetic
Responses to Selection.


The paper compares sampling errors of four types of estimates for 
genetic response to selection. These are: 1. Repeated intrayear com- 
parisons of successive generations. 2. Comparison of gross and environ- 
mental changes in the same population. 3. Comparison of gross and 
environmental changes of independent populations. 4. Comparison 
of gross changes in two independent populations. Methods 1, 2 and 3 
refer to use of a special control strain in which the same generation is
reproduced from parents of the same age in two successive years.
Method 4 refers either to comparison of two populations, each under
a different type of selection, or to use of a control strain in which it
is assumed that there is no genetic change over an indefinite period of
years. Methods 1 and 3 appeared to give similar precision of estimation
from a given scale of experiment. Method 2 is noticeably more efficient.
Method 4 provides still more precise estimates, but requires the
assumption of negligible genetic change in the control strain over an
indefinite number of years, whereas Methods 1, 2 and 3 do not require
this assumption.


631 W. J. DIXON (University of California, Los Angeles, California).
Estimation in Short Up-and-Down Sensitivity Trials.


Estimates of LD$_{50}$ with almost constant variance are obtained
exactly through maximum likelihood estimation procedures for
up-and-down sensitivity trials for sequentially determined sample sizes
of minimum size 2(1)6 and for larger minimum lengths approximately.
An example of the use of these estimates in a factorial design is presented.


632 J. L. HODGES, JR. (University of California, Berkeley,
California). A Two Stage Sequential Design for Bio-Assay.


Because trials are informative only when the dose is near the
fifty-percent response point $\mu$, the bio-assay situation can benefit
from sequential design to an unusual degree; but the long response time
will characteristically make a completely sequential design not
feasible. It seems reasonable in these circumstances to consider designs
carried out in a few stages. The present paper presents a two-stage
design for estimating or testing $\mu$, whose most attractive feature is
the ease of analysis. In the first stage, one trial is performed at each
of a number of equally spaced levels, while in the second stage the
trials are concentrated at two levels chosen on the basis of the results
from the first stage.


633 B. J. HOYLE AND G. A. BAKER (University of California,
Tulelake Field Station and University of California, Davis,
California). Factor Analysis of Twenty-eight Independent Field Yield
Trials on Nine Strains of Hannchen Barley.


It has been shown with numerous uniformity trials of other workers
that the responses of plants to small local environments are quite
variable. This variation is evident no matter how small and apparently
uniform the soil area is made, as indicated in these studies. It is also
true that there is usually some tendency for plants on small plots close
together to respond somewhat similarly. However, no practical way of
picking small plots in advance of growing that will produce the same
plant response has as yet been devised. Certainly mere nearness is
not a sufficient criterion for equality of response.

The common, conventional, inflexible field designs arbitrarily
separate the variation inherent in field trials into parts, one of which
is labeled experimental error. This experimental error, so called, is
fallaciously taken as a measure of the stability of the relative yielding
abilities of the varieties (or treatments) under investigation. In order
to keep the experimental error small so that differences can be called
significant it is usually recommended that the soil and other conditions
be made as uniform as possible.

The present experiments indicate that designs can be quite simple 
and flexible and that the resulting estimates of the relative yielding 
abilities are far more stable than has been generally believed. Also, 
it is not necessary nor even desirable to have all conditions unreal- 
istically uniform. 

The relative yielding abilities were quite stable over a wide range 
of local environments. Thus, realizing the island-like variation in 
plant response it is possible to design and interpret field trials in such 
a way as to save time and money and increase confidence in our findings. 


634 RAYMOND J. JESSEN (General Analysis Corporation, Los Angeles,
California). Objective Procedure for Estimating the Florida Orange
Crop.


A description will be given of a procedure for obtaining a sample 
of trees and fruits for estimating the total quantity of oranges on the 
trees prior to harvest and for forecasting the crop to be picked. As 
part of the procedure, an estimate was made of the total number of 
bearing trees by harvest class (early and mid-season and late) and, 
since the survey was made, final tabulations of a tree census are available 
for a comparison of the survey’s accuracy. The method of obtaining 
a randomized sample of fruits on a tree is described and also the method 
of measuring the brix and other quality characteristics. The survey 
was carried out during 3 seasons. 


635 H. A. KORNBERG (General Electric Company, Richland,
Washington). Radiostrontium-Calcium Relationships in Plants and
Animals.


Three criteria for the validity of application of the OR (observed
ratio), which is in widespread use in radiobiological investigations
involving element pairs, are established. The specific elements
considered in this paper are calcium and strontium-90. From mathematical
considerations, and from laboratory investigations, it is shown that
one or more of these criteria may fail in many investigations. The
constant OR is replaced by the more general DF (discrimination factor),
in which the DF is allowed to be a function of both time and
concentration of dietary calcium. A hypothesis is proposed which is
capable of providing a qualitative explanation of the variability of the
DF with increasing concentration of dietary calcium. More
comprehensively, a mathematical model is constructed in an attempt to
encompass both time and dietary calcium in explaining variable
discrimination of a biosystem against one of the elements, strontium-90
and calcium. Specifically, a multicompartment model is assumed, and a
system of differential equations for the time-dependent amounts of
calcium and strontium-90 present in the various compartments is derived.


636 ROBERT H. RIFFENBURGH (University of Hawaii, Honolulu,
Hawaii). Parameter Estimation for Fish Growth Curves.


Biologists have long been troubled by the lack of estimators for the 
parameters involved in growth curves, especially the Gompertz curve 
(for which maximum likelihood estimates do not exist). This paper 
presents methods for estimating parameters of the Gompertz curve and 
other curves for the conditions: (a) when the sample consists of a set 
of size measurements on the same animal or population with the time 
between each measurement known, and (b) when the sample consists 
of a set of pairs of size measurements, each pair on a different animal, 
and a fixed time elapsing between the first and second members of the 
pair, as in capture-tag-recapture data for fish. Illustrations of each 
technique are given. 
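
Condition (b) can be illustrated with a small sketch. It is not the estimator of the paper, which the abstract does not give; it assumes the Gompertz form W(t) = A exp(-b exp(-kt)), for which log W(t + d) is linear in log W(t) with slope exp(-kd) and intercept (1 - exp(-kd)) log A. The function names and the numerical values are hypothetical.

```python
import math

def gompertz(t, A, b, k):
    """Gompertz size at time t: A * exp(-b * exp(-k * t))."""
    return A * math.exp(-b * math.exp(-k * t))

def fit_gompertz_from_pairs(first, second, dt):
    """Estimate asymptote A and rate k from (size, size after dt) pairs.

    Ordinary least squares of log(second) on log(first); the slope
    estimates exp(-k * dt) and the intercept (1 - slope) * log(A).
    """
    x = [math.log(v) for v in first]
    y = [math.log(v) for v in second]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    k = -math.log(slope) / dt
    A = math.exp(intercept / (1.0 - slope))
    return A, k

# Hypothetical, noise-free tag-recapture pairs: each animal is measured
# at an unknown age and again a fixed interval dt later.
A_true, b_true, k_true, dt = 80.0, 2.0, 0.3, 1.0
ages = [0.5, 1.0, 2.0, 3.5, 5.0]
first = [gompertz(t, A_true, b_true, k_true) for t in ages]
second = [gompertz(t + dt, A_true, b_true, k_true) for t in ages]
A_hat, k_hat = fit_gompertz_from_pairs(first, second, dt)
```

The ages at first capture are never used in the fit, which is the point of the paired design; the constant b cannot be recovered from such pairs alone.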


637 CRAWFORD F. SAMS (University of California, Berkeley, 
California). Some Effects of Radiation on Aging. 


A study has been made of the processes involved in early aging and 
shortened life span as the result of exposure to external ionizing radiation. 
A number of investigators have studied the causes of death in irradiated 
animals in relation to age at time of death and have compared these 
results with the distribution of causes of death in relation to age in 
control groups. This work has led to the statement that radiation 
induced early aging and shortened life span apparently is an acceleration
of the normal aging processes. It is a compression of normal
processes in a shortened time interval rather than the deletion of a
specific segment of the normal life span. There is much evidence to 
support this view. 

Certain correlations exist in the normal aging process. The cardio- 
vasculorenal disease complex is the one degenerative disease showing 
a straight line correlation with increased age. There is a correlation 
between the generalized systemic process, atherosclerosis, and the 
cardiovasculorenal disease complex. 

There are correlations between certain metabolic factors and the 
development of atherosclerosis with increased age.

One striking difference is noted in all of the correlations and that is 
the sex difference. It is this striking sex difference that led to the next 
logical study, that is, the central nervous system-endocrine gland 
system, which controls body metabolic processes. 

The current knowledge available on radiation induced effects, as 
they pertain to aging, has shown identical correlations with those 
found in normal aging, but they are accelerated in point of time. 

The fact that correlations exist does not necessarily prove a cause 
and effect relationship. A cause and effect relationship has been postu- 
lated. Whether or not this postulation is valid can be determined only 
by a carefully controlled key experiment, which is now under way. If 
the sequence of events and their cause and effect relationships are 
found to be valid, then a way is open to modify the course of radiation 
induced early aging as a delayed effect of radiation. 


638 M. B. SHIMKIN (National Cancer Institute, Bethesda, Maryland).
Some Quantitative Aspects of Induction of Lung Tumors in Mice.


Quantitative histologic observations were made of lungs of strain 
A mice killed from 1 to 133 days following a single intraperitoneal 
injection of 1 mg. of urethan per gram body weight.

Within a day, and reaching a peak at approximately 4 weeks, there 
is a generalized increase in the number of nucleated cells in the lung. 
By 3 weeks there appear circumscribable hyperplastic foci, and by 5 
weeks there are approximately 600 such foci per animal. The number 
of hyperplastic foci then gradually decreases. 

The first pulmonary tumors are recognized in sections taken 3 
weeks after urethan. The number increases rapidly to 36 per lung at 
7 weeks, and then remains at that level. The mean size of the tumors 
increases rapidly. The mean doubling time of the larger tumors is 
4.4 days up to 7 weeks, after which the relative rate decreases to a
mean doubling time of 55.5 days between 12 and 19 weeks. The absolute
rate of change in the geometric mean of the volume, however, remains 
fairly constant between 7 and 19 weeks. At 19 weeks, pulmonary 
tumors occupy only 1 percent of the lung volume and represent 2.5 
percent of the total cell population of the lung. (J. Nat. Cancer Inst., 
16: 75-93, 1955). 
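
The doubling times quoted above follow from two volume measurements if growth is taken to be exponential over the interval, an assumption made here only for illustration. The function name and the volumes are hypothetical and are not data from the paper.

```python
import math

def doubling_time(volume_start, volume_end, elapsed_days):
    """Doubling time in days, assuming exponential growth over the interval."""
    if volume_start <= 0 or volume_end <= volume_start:
        raise ValueError("volumes must be positive and increasing")
    return elapsed_days * math.log(2) / math.log(volume_end / volume_start)

# Hypothetical example: an eightfold increase (three doublings) in 13.2 days.
td = doubling_time(1.0, 8.0, 13.2)
```

An eightfold increase in 13.2 days corresponds to three doublings, or 4.4 days per doubling.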

Quantitative measurements were carried out on the growth of 
induced primary pulmonary tumors in mice of strains A and C3H. 
The growth rate characteristic of the tumors appears to be a property 
of the tumor itself and particularly dependent upon the size of the 
mass. The growth rate of pulmonary tumors is more rapid in strain 
A than in strain C3H mice by a ratio of 1 to 0.6. The growth is mani- 
fested by a rapid initial phase with gradual deceleration. It is suggested 
that this deceleration may be due in part to a negative gradient of the 
rate of cell division from the surface toward the center of the tumor 
mass. (J. Nat. Cancer Inst. 21: 595-610, 1958). 


639 JOHN E. WALSH (System Development Corporation, Santa Monica,
California). Computer-Feasible General Method for Fitting and
Using Regression Function When Data Incomplete.


The data ordinarily used for determining the regression function of 
a dependent variable on specified independent variables consist of a 
number of multivariate observations, where each observation contains 
values for the independent variables and the corresponding value for 
the dependent variable. However, in many situations involving
biological, medical, and other types of data, some of the values for the
variables are missing. This can happen among the observations used
in determining the regression function and also in the application of the
regression function for estimation purposes. This paper presents a
generally applicable method for handling these two problems. The
underlying procedure involves the estimation of the missing value for
an independent variable from its regression function on the independent
variables with known values. A scheme which is feasible for application
on a high-speed computer is developed for determining the huge
number of regression relations that can arise among subsets of the
independent variables. This scheme consists in first establishing a
basic set of regression relations among sufficiently small subsets of the
independent variables and then determining the remaining regression
relations in terms of appropriately weighted sums of functions in the
basic relations. Also, for cases where the forms specified for the regression
functions are linear in unknown constants, a special type of
least-squares curve fitting technique is developed for determining the
constants on the basis of incomplete data.


640 A. D. WIGGINS (General Electric Company, Richland, Washington).
A Life Testing Problem Arising from Exposure to Radioactive Waste.


A mathematical model of a censored life-testing experiment is 
constructed using the times of death as random variables. A failure 
rate which is a step function of time is assumed, with the result that 
the survival probability is a modified form of exponential law. For 
a test of the composite null hypothesis that the control and experimental 
groups have the same failure rate in specified intervals in which the 
failure rates do not change, against a one-sided composite alternative, 
it is shown that there exist similar regions, conditioned on fixed numbers 
of failures within the specified intervals. The form of the similar 
regions and the test criterion are derived and the characteristic function 
of the test criterion is exhibited. An application to a population of 
fish exposed to radioactive effluents is considered.
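
The "modified form of exponential law" can be sketched as follows. With a failure rate that is a step function of time, the survival probability is the exponential of minus the accumulated rate. The interval boundaries, the rates, and the function name are hypothetical; the test criterion of the paper is not reproduced.

```python
import math

def survival(t, breakpoints, rates):
    """Probability of surviving past time t under a step-function failure rate.

    breakpoints[i] is the start of interval i (breakpoints[0] == 0);
    rates[i] is the constant failure rate on that interval, and the
    last interval extends to infinity.
    """
    cumulative = 0.0
    for i, rate in enumerate(rates):
        start = breakpoints[i]
        end = breakpoints[i + 1] if i + 1 < len(breakpoints) else math.inf
        if t <= start:
            break
        # Time actually spent in this interval before t.
        cumulative += rate * (min(t, end) - start)
    return math.exp(-cumulative)

# Hypothetical rates per day on [0, 10), [10, 30) and [30, infinity).
breakpoints = [0.0, 10.0, 30.0]
rates = [0.01, 0.02, 0.05]
s20 = survival(20.0, breakpoints, rates)  # exp(-(0.01*10 + 0.02*10))
s40 = survival(40.0, breakpoints, rates)  # exp(-(0.01*10 + 0.02*20 + 0.05*10))
```

Within any one interval the law is the ordinary exponential; the step changes only rescale the exponent from one interval to the next.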


The following papers were presented at the meeting of the Eastern North American 
Region which was held at Pennsylvania State College, State College, Penn. in conjunction 
with meetings of the American Institute of Biological Sciences, August 31—September 
8, 1959. 


641 EDWIN L. COX (Biometrical Services, ARS, Department of
Agriculture, Beltsville, Maryland). May Some Data be Discarded?


Statisticians as well as research workers have observed experimental 
data which seemed to be contaminated with extreme values or outliers. 
While there is an extensive statistical literature presenting criteria for 
making decisions concerning such data, there is no uniformity of practice 
among either statisticians or research workers. The statistical literature 
is reviewed and its application to specific problems shown by the use 
of examples. 


W. T. FEDERER (Cornell University, Ithaca, New York).
Experimental Error Rates.


Three bases for setting per cent error rates in experimental work
are discussed and illustrated. The bases are per comparison or comparisonwise,
per experiment, and experimentwise. In addition, a discussion
of confidence and significance statements is given along with
procedures for obtaining $1-\alpha$ per cent sample confidence intervals on
different error-rate bases. The procedure for dividing all $1-\alpha$ per cent
sample confidence intervals into two groups, viz. those less than or 
equal to a specified number, say $2\delta_0$, and those greater than $2\delta_0$, and
for obtaining the expected proportion of sample intervals in each of 
the two groups is described and illustrated. 


64. W. R. HARVEY (USDA, ARS, Biometrical Services, Beltsville,
Maryland). Analysis of Data with Unequal Subclass Numbers
when a Set of Orthogonal Comparisons Among the Subclass
Means is Desired.


The matrix inverse to the least squares variance-covariance matrix 
can be easily obtained when (i) every subclass contains at least one 
observation and (ii) a set of orthogonal comparisons is desired among 
the subclass means. When these two conditions exist, the constants 
to be fitted can be expressed in terms of the subclass means. The 
functional relationship between the constants and the subclass means 
provides a matrix of coefficients which is used to weight the reciprocals 
of the subclass numbers to obtain the inverse elements. For example,
in the two-way classification the complex model is
$y_{ijk} = \mu + a_i + b_j + (ab)_{ij} + e_{ijk}$ and the simple model is
$y_{ijk} = s_{ij} + e_{ijk}$, where $s_{ij} = \mu + a_i + b_j + (ab)_{ij}$.
The coefficients of the least squares normal
equations for the simple model form a diagonal matrix, D, and the inverse
of the coefficient or variance-covariance matrix for the complex model
(C) is obtained from $KD^{-1}K'$, where K is the matrix of coefficients
from the functional relationship of the $s_{ij}$ and the constants $\mu$, $a_i$, $b_j$,
and $(ab)_{ij}$. The restriction is imposed that

$$\sum_i a_i = \sum_j b_j = \sum_i (ab)_{ij} = \sum_j (ab)_{ij} = 0.$$


Application of this procedure to the analysis of a set of cross line data 
is presented. 
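
A numerical check of the identity may help. The sketch below assumes a 2 x 3 classification with hypothetical subclass numbers and sum-to-zero (effect) coding of the constants; the helper names are illustrative. It builds the cell-level coefficient matrix M in s = M(constants), takes K as its inverse, and compares K D^{-1} K' with the directly inverted coefficient matrix of the normal equations for the complex model.

```python
import numpy as np

def effect_codes(levels):
    """Sum-to-zero coding: levels - 1 columns, last level coded -1."""
    return np.vstack([np.eye(levels - 1), -np.ones((1, levels - 1))])

def cell_design(a, b):
    """Rows are cells (i, j) in row-major order; columns are the
    constants mu, a_i, b_j, (ab)_ij under the sum-to-zero restrictions."""
    A, B = effect_codes(a), effect_codes(b)
    rows = []
    for i in range(a):
        for j in range(b):
            rows.append(np.concatenate(([1.0], A[i], B[j], np.kron(A[i], B[j]))))
    return np.array(rows)

# Hypothetical unequal subclass numbers, every subclass filled.
n = np.array([3, 1, 4, 2, 5, 2])
M = cell_design(2, 3)           # subclass means s = M @ constants
K = np.linalg.inv(M)            # constants = K @ s
D_inv = np.diag(1.0 / n)        # reciprocals of the subclass numbers
C_inv = K @ D_inv @ K.T         # inverse via K D^{-1} K'

# Direct route: one design row per observation, then invert X'X.
X = np.repeat(M, n, axis=0)
direct = np.linalg.inv(X.T @ X)
agree = np.allclose(C_inv, direct)
```

Only the small matrix M is inverted on the first route, and it depends on the layout alone; the data enter solely through the reciprocals of the subclass numbers.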


WALTER C. JACOB (University of Illinois, Urbana, Illinois). 
Interpretation of Experimental Results. 


The basis of inference is actually built into the experiment by the 
design characteristics. All sources of variation which are randomized 
into the error term are these bases of inference and must be recognized 
when drawing conclusions. 

In single-factor experiments the major problem of interpretation 
lies in comparing the means of treatments. Use of Duncan’s multiple
range test is suggested for general situations with no built-in comparisons,
but individual degrees of freedom represent the best approach
where possible. 

With factorial experiments the problem of interaction interpretation 
is added to those problems of single-factor tests. A simple approach to 
interaction interpretation is the use of a graphical display of the points 
so that the source of significance can be determined. 

It is emphasized that the interpretation of experimental results 
must be in terms of the subject matter field and not in terms of statistical 
tests of significance. 


E. JAMES KOCH (United States Department of Agriculture,
Beltsville, Maryland). Presentation of Experimental Results.


Many research workers have had difficulty in deciding on the best 
method of presenting the results of a statistical interpretation of a set
of data. In this paper, suggestions are given as to which results of the
experiment to present and several possible methods of presenting the 
same results are given. A letter system of presenting the results of a
Duncan multiple range comparison of the means is illustrated for 
randomized block, split plot and lattice designs. A method is proposed 
for deciding where to enter Duncan’s tables for comparisons of split 
plot treatments between whole plots. Suggestions are given as to 
when it is desirable to present an analysis of variance and what form 
the analysis should take. Methods of presenting results of several 
variables in a single table are discussed. 


HOWARD LEVENE (Columbia University, New York, New
York). Association Between Blood Groups and Disease.


A large number of studies have shown sufferers from certain diseases 
to have significantly higher frequencies of one of the ABO blood types 
than do controls. For example, duodenal ulcer patients have more O, 
suggesting that type O individuals are more susceptible to this disease. 
Various objections can be made to this inference of different suscepti- 
bility, and to help overcome some of them Clarke et al. have studied 
sibs of ulcer index cases. Using a statistical method suggested by
C. A. B. Smith, no significant association was found in a series of 425 
sibships. The present paper deals with a modification of Smith’s 
method that yields confidence limits for the ratio r of susceptibility 
in O to not-O individuals. It is shown that this method has certain 
optimality properties. Application of the method to the data of Clarke 
et al. yields 95% confidence limits compatible with either no association
(r = 1), or an association even larger than that found in the various
population studies. 


647 R. C. LEWONTIN (Department of Biology, University of
Rochester, Rochester, New York). A Monte Carlo Study of a
Complex Problem in Natural Selection.


Populations of Mus musculus are highly polymorphic for a series 
of alleles at the t locus which are lethal or semilethal when homozygous.
Heterozygous males produce unequal numbers of + and t sperm, the
proportion of t varying from .90 to .99. Females have normal gametic
ratios, however. Such a system leads to an expectation of balanced 
polymorphism with adult heterozygote frequency near .50. This is 
much higher than has been found in nature. 

The breeding structure of mouse populations makes it important 
to consider the effects of small population size. Present knowledge 
suggests that there are small family groups with some migration from 
group to group. The effect of such a breeding structure on the poly- 
morphism has been investigated using a Monte Carlo approach with 
the 650 digital computer. Parameters which were varied were: number 
of parental males and females, segregation ratio in males, and initial 
population composition. Each parameter set was run for 200 generations 
per replicate with between 56 and 119 replicates per set. The results 
could then be analyzed by standard statistical techniques for the effect 
of a change in each parameter. 

The results show that the observed polymorphism in nature is well 
explained by the assumption of a breeding unit of about 2 males and 6 
females with some migration between family groups and by occasional 
introduction of a migrant male heterozygote into an otherwise normal 
population. Such an introduction usually results in the successful 
“infection” of a normal population with the mutant t allele.


648 C. C. LI (University of Pittsburgh, Pittsburgh, Pennsylvania). 
Selection When Gene Effects Are Multiplicative. 


Let $r_i$ be the effect on fitness (selection effect) of the allele $A_i$. We
say that the gene effects with respect to selection are multiplicative if
the fitness of the genotype $A_iA_j$ is $r_ir_j$. With this type of selection,
the genotypic proportions of a random mating population will conform
with the Hardy-Weinberg law or its extensions before as well as after
the operation of selection. This fact follows from the theorem that,
if the terms of $(q_1 + \cdots + q_k)^n$ are multiplied by the corresponding
terms of $(r_1 + \cdots + r_k)^n$, with the coefficient deleted, the result is
$(q_1r_1 + \cdots + q_kr_k)^n$. Hence the selection effect cannot be detected
by merely studying the proportions of genotypes in a panmictic popu- 
lation. 
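
The theorem can be checked numerically for the diploid case. The sketch below uses hypothetical allele frequencies and fitness effects for three alleles; the function names are illustrative. It applies multiplicative selection to Hardy-Weinberg proportions and compares the result with Hardy-Weinberg proportions at the new allele frequencies, taken to be proportional to the products of frequency and fitness effect.

```python
from itertools import combinations_with_replacement

def hardy_weinberg(p):
    """Genotype proportions under random mating for allele frequencies p."""
    return {
        (i, j): p[i] * p[j] * (1 if i == j else 2)
        for i, j in combinations_with_replacement(range(len(p)), 2)
    }

def select_multiplicative(q, r):
    """Weight each Hardy-Weinberg genotype (i, j) by r[i] * r[j], renormalize."""
    before = hardy_weinberg(q)
    weighted = {g: f * r[g[0]] * r[g[1]] for g, f in before.items()}
    total = sum(weighted.values())
    return {g: w / total for g, w in weighted.items()}

# Hypothetical allele frequencies and multiplicative fitness effects.
q = [0.5, 0.3, 0.2]
r = [1.0, 0.8, 0.5]
after = select_multiplicative(q, r)

# Allele frequencies after selection are proportional to q_i * r_i.
mean_r = sum(qi * ri for qi, ri in zip(q, r))
p = [qi * ri / mean_r for qi, ri in zip(q, r)]
expected = hardy_weinberg(p)
max_gap = max(abs(after[g] - expected[g]) for g in after)
```

The genotype proportions after selection are again of Hardy-Weinberg form, so a census of genotypes alone would not reveal that selection had acted.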


649 N. E. MORTON (University of Wisconsin, Madison, Wisconsin).
Genetic Analysis with Digital Computers.


In recent years electronic computers have been used for a number
of genetic problems, including Monte Carlo approaches to stochastic
processes, numerical solution of differential equations, prediction of
successive generations under selection, computation of complex probabilities
in linkage, and statistical analysis of large experiments or nonexperimental
observations. In all these applications, the computer
performs the function of a large number of desk calculators; so many,
in favorable cases, that the analysis would not have been feasible
without the computer. Increasing use of these machines is inevitable,
but is attended by certain disadvantages, especially the rapid turnover
of current computers, their high cost, and, on occasion, the difficulty
of validating the results.

Some applications to human genetics are described, including
linkage analysis by probability ratio scores, utilization of vital statistics
and large-scale medical registries, segregation analysis and its implication
for population genetics, discrimination of genetic entities, and
control of nongenetic variables by multiple regression, especially with
the SEGRAN, EQUIGEN, BARTAX, and MULREG programs for
the IBM 650.


650 GEORGE F. POTTER (USDA, ARS, Crops Research Division,
Bogalusa, Louisiana). Statistical Problems Encountered by the
Horticulturist.


Experimental designs suitable for use by horticulturists are dis- 
cussed, and problems met in attaining a suitable degree of precision 
in experiments with horticultural material are described. Emphasis 
is placed on attaining uniformity of experimental units within the 
blocks. It is strongly recommended that the size and shape of blocks 
be such as to minimize the effects of the principal sources of error, for 
example, differences in soil or in frost hazard. 


651 N. R. THOMPSON (Virginia Polytechnic Institute, Blacksburg,
Virginia). Hierarchical Analyses of Survey Data with the IBM 650
Computer.


The analysis of approximately 30,000 sets of observations on dairy
cows necessitated use of a high-speed digital computer. A nested or
hierarchical analysis of variance seemed appropriate, and a computer 
program for this type of analysis was written. The program handles two 
Y-variables (milk yield and fat yield) simultaneously, and calculates 
at three levels or classifications (year, herd, state) plus “within” and 
“overall”. Output includes uncorrected sums of squares, and quantities 
to use in obtaining degrees of freedom and coefficients of variance 
components. 

This program can be extended to handle more than two Y-variables at 
a time, and also to obtain covariances. Maximum capacity of the 
program, using the IBM 650 with 2,000-word storage drum, appears 
to be five Y-variables and their co-variables. Various sub-routines, 
e.g., to calculate means and variances of sub-groups, can be added 
readily. 

Use of this program has revealed marked differences in size and 
relative importance of sources of variance. Further use is expected 
to aid in revealing genetic and non-genetic fractions of these sources. 
The program can be used in estimation of intra-class correlations 
(e.g., among half sibs) and, with the addition of steps for co-variables, 
to estimate genetic correlations.


ACKNOWLEDGMENT


In Query 139 on Differential Regression answered by John T. 
Webster in Biometrics 15, 326-329, the data of Table 1 and the experi- 
mental situation described were extracted from the following reference: 


Mer, C. L. [1957]. A Re-examination of the Supposed Effect of 
Riboflavin on Growth. Plant Physiology 32, 175-185. 


We regret that proper acknowledgment was not given in Query
139.


International


Dr. L. Martin of the Region Belgique et du Congo Belge has been
elected President of the Society for 1960. 

The Symposium on Quantitative Techniques in Pharmacology is 
to be held in Leiden, Holland, during May 10th-13th, 1960. Details of 
the programme can be had from H. de Jonge, Ned. Inst. voor 
Praeventieve Geneeskunde, Wassenaarseweg 56, Leiden, Holland. 

The Region Francaise announces with deep regret the death of 
Prof. G. Darmois, a past President of the Society. 

Dr. A. R. Roy has replaced Dr. K. Kishen as National Secretary 
for India. 


Deutsche Region 


The Seventh Biometric Colloquium of the German Region was held
in Bad Nauheim during January 22nd-24th, 1960. Sessions were devoted
to Sequential Analysis, Allometry and Variate Transformations, and 
groups of papers were given on Contagious Distributions and on analysis 
of Periodic Phenomena. 

British Region 
The following officers have been elected for 1960: 
President—K. Mather, 
Secretary—C. C. Spicer, 
Treasurer—P. A. Young. 


Two papers given at a meeting on Oct. 28, 1959 were: 


P. S. Hewlett & R. L. Plackett—Models of joint Drug Action. 
J. G. Skellam & M. D. Mountford—Simultaneous growth and 
diffusion in population dynamics. 


Following the Annual General Meeting on Dec. 15, 1959, J. O. Irwin, 
the retiring President, read a paper on the contributions of A. G. 
McKendrick to the theory of stochastic processes and their applications
in epidemiology.

Australasian Region
The following officers have been elected: 
President—M. Belz, 
Secretary—W. B. Hall, 
Treasurer—G. W. Rogerson. 
ENAR 
The following officers have been elected: 


President—W. Federer, 
Secretary-Treasurer—M. Kastenbaum. 


The Region met jointly with the American Statistical Association and 
the Institute of Mathematical Statistics in Washington, D. C. during 
Dec. 27th-30th, 1959. Sessions were held on Design of Experiments, 
Multivariate Analysis, Human Genetics, Classification and Discrimina- 
tion, Ratio Estimation, Statistical Concepts and Definitions, Mathe- 
matical Models in Biology, Clinical Trials and Quantification and 
Measurement in Biology and Social Sciences. 


TWELFTH ANNUAL BUSINESS MEETING 
OF THE BIOMETRIC SOCIETY (ENAR) 


December 29, 1959 


The twelfth Annual Business Meeting of the Eastern North American 
Region of The Biometric Society was called to order at 4:00 P.M., on 
December 29, 1959, at the Shoreham Hotel, Washington, D. C. The 
presiding officer was Jerome Cornfield, the Regional President. About 
sixty persons were in attendance. 

Minutes of the Eleventh Annual Business Meeting and the 1959 
ENAR Treasurer’s Report were read and approved. A recommendation 
was made to the Regional Committee that the Palo Alto meeting with 
ASA on August 23-26 be designated the Thirteenth Annual Meeting of 
ENAR. This motion was not passed unanimously. A motion was passed 
that the study of the desirability of transferring clerical functions to 
outside agencies be continued for another year. A motion was defeated 
that the Committee To Work With Medical Men be disbanded. A 
recommendation was made to the Regional Committee that ENAR be
made one of the supporting agencies of the proposed statistics section
of AAAS. A further recommendation was made that the Regional 
Committee approve a proposal to further develop joint relationships 
with other societies. This would involve a two to three day meeting of 
designated representatives from various societies using funds supplied 
by the Ford Foundation. A vote of thanks was tendered to Theodore 
W. Horner for his services as Secretary-Treasurer of ENAR during 
1958 and 1959. 

Walter Federer gave a report on program plans for 1960. The Society 
plans to meet with IMS at Columbia University on April 21-23; with 
AIBS at Oklahoma State in August; and with ASA at Palo Alto, Cali- 
fornia, on August 23-26. C. I. Bliss gave a report on the activities of 
his Committee to Work With Medical Men. 

Discussions were held on the desirability of transferring ENAR 
clerical functions to outside agencies, sponsors for membership applica- 
tions, relationships with medical men, and future program plans. 


Respectfully submitted, 
Theodore W. Horner 


ENAR TREASURER'S REPORT FOR 1959


by 
Theodore W. Horner, Secretary-Treasurer 


December 31, 1959 


INCOME

Balance Forward
  Checkbook balance January 1, 1959              $578.40
  Credit with International Treasurer               6.00     $584.40
Dues payments                                               4,772.50
Share of proceeds from Chicago 1953 meeting                   100.17
Payment from ASA for Pittsburgh programs                       59.72
Other Societies                                                23.00
Replacement of checks not honored                              14.00
Sustaining member credit                                       80.00
Funds collected for WNAR                                        2.00

                                                           $5,635.79

EXPENSES

Checks not honored                                            $14.00
Transfer of funds to WNAR                                       2.00
Payment for ASA for Pittsburgh programs                        59.72
Treasurer International                                     4,681.00
ENAR expenses
  Clerical                                       $154.60
  Supplies                                         97.00
  Stamps                                          193.10
  Printing                                        106.38
  Service Charges                                   3.70      554.78
Checkbook balance December 31, 1959                           324.29

                                                           $5,635.79

OPERATING ANALYSIS

Operating income
  Dues payments                                  $705.50
  Other societies                                  23.00
  Share of proceeds from ASA meeting              100.17
  Sustaining member credit                         80.00     $908.67
Operating Expense                                            -554.78
Surplus for 1959                                             $353.89

CHECKBOOK BALANCE

Surplus for 1959                                             $353.89
Operating Balance deficit December 1958
  as regards checkbook                                        -29.60
Checkbook balance, December 31, 1959                         $324.29
Savings account balance                                      $716.11


CALL FOR CONTRIBUTED PAPERS 


Contributed Papers for the Joint Meeting of ENAR—IMS at 
Columbia University on April 21-23, 1960 are being solicited. Titles 
and abstracts in duplicate should be sent to Dr. Boyd Harshbarger, 
Department of Statistics, Virginia Polytechnic Institute, Blacksburg,
Virginia as soon as possible.


MEETINGS OF E.N.A.R.


At the invitation of the Western North American Region E.N.A.R. 
will meet jointly with them and with the American Statistical Associa- 
tion at Stanford University, Palo Alto, California, on August 23-26, 
1960. Titles and abstracts, the latter in duplicate in the form published 
in Biometrics, of contributed papers for E.N.A.R. should be sent to 
Prof. Oscar Kempthorne, Department of Statistics, Iowa State College,
Ames, Iowa. 

E.N.A.R. will also meet jointly with the American Institute of 
Biological Sciences at Oklahoma State University in Stillwater, Okla- 
homa, on August 28-September 2, 1960. 


Programs for both these meetings will be announced later. 


Switzerland 
MEETING IN SWITZERLAND 


The international meeting (28th September to 2nd October in Berne),
devoted to statistical methods in medical research and the pharmaceutical
industry, organised by the Swiss Section of the Biometric Society,
was attended by 90 participants from different European countries.
The following papers were read: Abt (Zürich), Kovarianzanalyse oder
Rechnen mit Differenzen? Batschelet (Basel), Zwillingsforschung bei
altersabhängiger Manifestation. Borth (Genf), Vergleichende
Gonadotropinbestimmung mit Präparaten verschiedener Herkunft. Mehrfache
Streuungszerlegung bei ungleichen Klassenfrequenzen. Effenberger
(München), 1) Ausbreitung der staubförmigen Luftverunreinigungen
in der Umgebung eines Grosskokswerkes. Exponentielle Abhängigkeit
des Staubfalles von der Entfernung. 2) Untersuchungen über die
Schädigung des Mycobact. tuberculosis bei der Abtötung der
Begleitkeime. Probittransformation und Streuungszerlegung. Güetli (Basel),
Ueber die Berechnung der Konstanten der Resorption und Elimination.
Transformation von Exponentialkurven in Gerade. Berechnung von
Vertrauensgrenzen. Hackenberg (Bielefeld), Das Rechnen im WL-System.
Skalentransformation. Heite (Marburg), Vergleiche von Behandlungen
im Rechts-Links-Versuch. Ausgewogene Versuche in unvollständigen
Blöcken. Kaelin (Zürich), Auswertung von Prozentzahlen
durch Transformation. Le Roy (Zürich), Anwendung der Zwillingsanalyse
für die Beurteilung der genetisch bedingten Merkmalsprägung. Anwendung
der Streuungszerlegung und Korrelationsrechnung in der
Populationsgenetik. Linder (Genf), Abhängigkeit der Häufigkeit der Steinbildung
von Geschlecht und Alter. Auswertung von Prozentzahlen durch
Transformation. Messikommer (Basel), Die Methode von Box und
Wilson zur Ermittlung optimaler Bedingungen. Oberhoffer (Bonn),
Schwierigkeiten bei der Anwendung statistischer Methoden in der
therapeutischen Erfolgsforschung. Pfanzagl (Wien), 1) Wird das
Auftreten der bulbären Form der Poliomyelitis durch Tonsillektomie
begünstigt? Exakter Test in Kontingenztafeln. 2) Steigt die
Enzephalitisgefährdung bei der subkutanen Pockenschutzimpfung nach dem
dritten Lebensjahr mit dem Alter an? Exakter Test für
Poissonverteilung. Rosin (Bern), Ueber die Verwertbarkeit der A-Untergruppen in
der Gerichtsmedizin. Schwarzenbach (Bern), Eine Anwendung des
Trennverfahrens bei der mikrobiologischen Untersuchung menschlicher
Seren. Weber (Kopenhagen), Auswertung der Ergebnisse eines Versuches
zur Trachombekämpfung. Probittransformation. Wegmüller (Bern),
Führung durch das Rechenzentrum. Welle (Heidelberg), Auswertung
von Prozentzahlen durch Transformation.


Le Roy (Zürich), National Secretary, Switzerland


CHANGES IN MEMBERSHIP 
(October 1, 1959-January 15, 1960) 


Changes of Address 


Prof. W. A. Bain, Smith, Kline and French Research Institute, 
Mundells, Welwyn Garden City, Herts., England. 

Mr. Elwood L. Bombara, 52 Augusta Drive, Newark, Delaware, U.S.A. 

Dr. James Lee Cason, Department of Animal Husbandry, University 
of Maryland, College Park, Maryland, U.S. A. 

Mr. Robert T. Chatterton, Jr., Department of Animal Husbandry,
Cornell University, Ithaca, New York, U.S.A.

Dr. Phelps P. Crump, Department of Biometrics, School of Aviation 
Medicine, Brooks Air Force Base, Texas, U.S.A.

Dr. Henry E. Daniels, Department of Mathematics, The University, 
Edgbaston, Birmingham 15, England. 

Ing. Agr. Luis E. Ramirez Davila, Gral. Varela 1905, Lima, Peru, 
South America. 

Dr. Paul M. Densen, Deputy Commissioner of Health, 125 Worth 
Street, New York 13, New York, U.S. A. 

Dr. Earl L. Diamond, Division of Chronic Diseases, Johns Hopkins 
University, Baltimore 5, Maryland, U.S. A. 

Mr. Ronald Dick, 425 Madison Street, Franklin Square, New York, 
U.S.A.

Mr. Roger F. Diffenderfer, 32 Point Lookout, Milford, Connecticut,
U.S. A. 

Mr. George E. Ferris, General Foods Corporation, 250 North Street, 
White Plains, New York, U.S. A. 

Prof. R. A. Fisher, University of Adelaide, Adelaide, Australia. 

Mr. Robert Fitzpatrick, 251 South 197th Street, Seattle 88, Washington, 
U.S. A. 

Mr. J.S. Gale, Department of Genetics, University of Glasgow, Glasgow 
W.2, Scotland. 

Dr. R. Gnanadesikan, 107 Passaic Avenue, Summit, New Jersey, U.S. A. 

Mr. Richard A. Greenberg, 6 Barnett Street, New Haven, Connecticut, 
U.S. A. 

Miss Glen Rae Hanémann, 369 Burkedale Boulevard, Highland Hills, 
San Antonio, Texas, U.S. A. 

Mr. Jack F. Hill, Babcock Poultry Farm, Inc., Box 286, Ithaca, New 
York, U.S. A. 

Dr. John Stuart Hunter, Apt. 6C, University Heights, Madison 5, 
Wisconsin, U.S. A. 

Dr. J. O. Irwin, London School of Hygiene, Keppel Street, London 
W.C. 1, England. 

Dr. Emil H. Jebe, Operations Research Department, University of 
Michigan, Ann Arbor, Michigan, U.S. A. 

Dr. Eileen B. Karsh, Department of Psychology, University of
Pennsylvania, Philadelphia, Pennsylvania, U.S. A.

Prof. Carl F. Kossack, IBM Research Center, P. O. Box 218, Lamb 
Estate, Yorktown Heights, New York, U.S. A. 

Dr. Wilhelm Kosswig, Bundesanst. f. Tabakforschg. b. Karlsruhe, 
Forchheim, Germany. 

Dr. H. L. Kravitz, Box 828, Burbank, California, U.S. A. 

Mrs. Katherine B. Ladd, 1618 43rd Street, S.W., Calgary, Alberta, 
Canada. 

Mr. J. M. Legay, Faculte des Sciences, Quai Claude-Bernard, Lyon, 
France. 

Mr. George F. Lunger, P. O. Box 833, Camden 1, New Jersey, U.S. A.

Mr. Jack A. Marshall, 7427 Choppel Avenue, Chicago 49, Illinois, 
U.S. A. 

Dr. G. G. Meynell, Lister Institute of Preventive Medicine, Chelsea 
Bridge Road, London S.W. 1, England.

Mr. Forest L. Miller, Jr., 606 W. Vanderbilt Drive, Oak Ridge, 
Tennessee, U.S. A. 

Dr. B. K. Mukerji, Director of Agriculture, U.P., “Hari Bhawan”,
7 Rani Laxmi Bai Marg, Lucknow, India.

Dr. Gordon L. Nordby, Department of Biochemistry, Harvard Uni-
versity Medical School, Boston 15, Massachusetts, U.S. A. 

Mr. Floyd R. Olive, c/o American Embassy, Khartoum, Sudan, Africa. 

Dr. Chai Bin Park, School of Public Health, Seoul National University, 
Seoul, Korea.

Mr. P. A. Parsons, Department of Genetics, 44 Storey’s Way, Cam- 
bridge, England. 

Dr. James A. Rafferty, 5008 Forest Haven Drive, Alexandria, Virginia, 
U.S. A. 

Dr. Anita Rapoport, 152 Corona Avenue, Long Beach 3, California, 
U.S. A. 

Dr. K. Rintelen, Hohenzollerndamm 117, Berlin-Grunewald 1, Germany. 

Mr. Robert Roeloffs, 82 Lakeside Road, Mahopac, New York, U.S. A. 

Mr. Wilhelm Seyffert, Bot. Inst. d. Univ., Gyrhofstr. 15-17, Koln- 
Lindenthal, Germany. 

Dr. Charles E. Shelby, Regional Swine Breeding Laboratory, 32 Curtiss 
Hall, Ames, Iowa, U.S. A. 

Dr. J. V. Smart, Smith, Kline and French Laboratories, Mundells, 
Welwyn Garden City, Herts., England. 

Mr. John J. Sowinski, Armour Research Foundation, 10 West 35th 
Street, Chicago 16, Illinois, U.S. A. 

Dr. Donald F. Starr, 1830 Grand Island Avenue, Grand Island, 
Nebraska, U.S. A. 

Mr. Raul Vargas, Oficina Sanitaria Panamericana, Charcas 684, Buenos 
Aires, Argentina, South America. 

Mr. Lyle H. Wadell, Department of Animal Husbandry, Cornell Uni- 
versity, Ithaca, New York, U.S. A. 

Mr. W. G. Warren, Department of Statistics, University of North 
Carolina, Chapel Hill, North Carolina, U.S. A. 

Mr. D. R. Westgarth, Rubber Research Institute of Malaya, P. O. 
Box 150, Kuala Lumpur, Malaya. 

Mr. Robert F. White, General Analysis Corporation, 11753 Wilshire
Boulevard, Los Angeles 25, California, U.S. A. 

Mrs. Sandra S. White, 535 E. 72nd Street, New York 21, New York, 
U.S.A. 


New Members 

Belgian Region 

Mr. A. Deville, I.N.E.A.C., Nioka, Belgian Congo. 

Dr. F. Ectors, I.N.E.A.C., B.P. 46, Nioka, Belgian Congo. 


Mr. Roger Firmin, Forest Research, Ministry of Agriculture, Wad 
Medani, Sudan.

Mr. G. Foucart, I.N.E.A.C., Station de Rubona, Ruanda-Urundi,
Africa. 

Mr. D. Forment, I.N.E.A.C., Nioka, Belgian Congo. 

Dr. G. E. J. Lambelin, I.N.E.A.C., B.P. 48, Nioka, Belgian Congo. 

Dr. Maritz, I.N.E.A.C., Nioka, Belgian Congo. 

Mr. J. Rossignol, I.N.E.A.C., Nioka, Belgian Congo. 

Dr. Pierre Soupart, 55 Square des Latins, Bruxelles, Belgium. 

Dr. M. Torfs, I.N.E.A.C., Nioka, Belgian Congo. 

Mr. A. Van Parijs, I.N.E.A.C., Nioka, Belgian Congo. 

Mr. R. Zwijsen, I.N.E.A.C., B.P. 58, Nioka, Belgian Congo. 


Brazilian Region 


Mr. Eduardo Abramides, Instituto Agronomico, Secao Tecnica
Experimental, Campinas, Sao Paulo, Brazil.

Mr. George O’Neill Addison, Instituto de Genetica, Piracicaba, Sao 
Paulo, Brazil. 

Mr. Oswaldo Giannotti, Instituto Biologico, Caixa Postal 7119, Sao 
Paulo, Brazil. 

Mr. Jose T. do Amaral Gurgel, Instituto de Genetica, Piracicaba, Sao 
Paulo, Brazil. 

Mr. C. Coelho Andrade Lima, Caixa Postal 205, Recife, Pernambuco, 
Brazil. 

Mr. Manoel Almeida Mendes, Escola Agronomico, Cruz das Almas, 
Bahia, Brazil. 

Mr. Mario Meneghini, Instituto Biologico, Caixa Postal 7119, Sao 
Paulo, Brazil. 

Mr. Vitoria Rossetti, Instituto Biologico, Caixa Postal 7119, Sao 
Paulo, Brazil. 

Mr. Jose Correia de Vasconcellos, Escola de Agronomia, Areia, Paraiba, 
Brazil. 

Mr. Roland Vencovsky, Instituto de Genetica, Piracicaba, Sao Paulo, 
Brazil. 

Mr. Wanderley Rinaldo Venturini, Instituto Agronomico, Secao Tecnica 
Experimental, Campinas, Sao Paulo, Brazil. 


British Region 

Mr. G. W. Bonsall, Rothamsted Experimental Station, Harpenden, 
Herts., England. 

Mr. S. F. Buck, Statistics Department, Rothamsted Experimental
Station, Harpenden, Herts., England.

Dr. M. G. Bulmer, Biometry Unit, 7 Keble Road, Oxford, England.

Mr. A. T. Dunn, Glaxo Laboratories, Sefton Park, Stoke Poges,
Buckinghamshire, England. 

Mrs. M. McDermott, Llandough Hospital, Penarth, Glamorganshire, 
England. 

Dr. P. G. Moore, Messrs. A. E. Reed and Company, Malling House, 
West Malling, Kent, England. 

Mr. E. R. Muller, Department of Statistics, University of Aberdeen,
Old Aberdeen, Scotland.

Mr. Charles E. Rossiter, 11 Stacey Road, Dinas Powis, Cardiff, Glam., 
Wales. 

Dr. M. Stack-Dunne, M.R.C. Control Laboratories, Holly Hill, Hamp- 
stead, London N.W. 3, England. 

Miss E. Tate, Biological Control Laboratories, Acacia Hall, Dartford, 
Kent, England. 

Mr. Jacob Thomas, Llandough Hospital, Penarth, Glamorganshire, 
England. 


Eastern North American Region 


Mr. Raymond R. Allmaras, Department of Agronomy, Iowa State 
University, Ames, Iowa, U.S. A. 

Mr. Gary D. Bearden, Biometrical Service, ARS, Agriculture Research 
Center, Beltsville, Maryland, U.S. A. 

Mr. Roger L. Bollenbacher, Statistical Laboratory, Purdue University, 
W. Lafayette, Indiana, U.S. A. 

Dr. Oscar K. Buros, Professor of Education, Rutgers University, New
Brunswick, New Jersey, U.S. A.

Mr. Steve A. Eberhart, 706 E. Whitaker Mill Road, Raleigh, North 
Carolina, U.S. A.

Mr. Jacob N. Eisen, 4104 Tulare Drive, Silver Spring, Maryland, 
U.S. A. 

Mr. Andris Fogelmanis, 566 Pammel Court, Ames, Iowa, U.S. A. 

Mr. Gerald Friars, Department of Poultry Science, Purdue University, 
W. Lafayette, Indiana, U.S. A. 

Mr. Donald K. Hotchkiss, Curtiss Hall, Iowa State College, Ames, 
Iowa, U.S. A. 

Mr. Samuel Hung, 53A Orchard Street, Cambridge 40, Massachusetts, 
U.S. A. 

Dr. Keith Huston, Department of Dairy Husbandry, Kansas State 
University, Manhattan, Kansas, U.S. A. 

Mr. Don Jenson, Soil Testing Laboratory, Iowa State University, 
Ames, Iowa, U.S. A.

Mr. Cecil L. Kaller, Statistical Laboratory, Purdue University,
Lafayette, Indiana, U.S. A. 

Mr. Thomas R. Konsler, Department of Experimental Statistics, 
North Carolina State College, Raleigh, North Carolina, U.S. A. 

Mr. Hugo E. Mayer, Jr., 271 Stiles Street, Elizabeth 3, New Jersey, 
U.S. A. 

Mr. Clinton Miller, University of Oklahoma, Medical Center, Bio- 
statistical Unit, Oklahoma City, Oklahoma, U.S. A. 

Mr. Charles E. Redman, Poultry Department, Peters Hall, University 
of Minnesota, St. Paul, Minnesota, U.S. A. 

Miss Rose Sachs, 2124 Pennsylvania Avenue, N.W., Washington 7,
D. C., U.S. A. 

Mr. Wilfred Salhuana, 2709 Mayview Road, Raleigh, North Carolina, 
U.S. A. 

Mr. Oliver B. Schnenk, 608 Second Street, S., Minneapolis 2, Minnesota,
U.S. A. 

Prof. Robert S. Temple, Animal Industry Department, Louisiana 
State University, Baton Rouge, Louisiana, U.S. A. 

Mr. William M. Walker, Agronomy Department, Iowa State Univer- 
sity, Ames, Iowa, U. S. A. 

Mrs. Sandra S. White, 1360 York Avenue, New York 21, New York, 
U.S. A. 

Mr. Marvin Zelen, 401 South Building, National Bureau of Standards,
Washington 25, D. C., U.S. A.


German Region 


Dipl. Math. R. J. Lorenz, Waldhauser Hohe, Tubingen (Neckar),
Germany. 


India 


Mr. M. V. Pavate, Central Tobacco Research Institute, Rajamundry 
(A.P.), India. 

Dr. A. R. Roy, Department of Statistics, Lucknow University, Lucknow, 
India. 

Mr. Tenneti Viswanatham, Chairman, Coffee Board, P.B. No. 2, 
Bangalore-9, India. 


Italian Region 

Dr. Columbia Aldo, Clinica Neurologia, Universita di Pavia, Pavia, 
Italy. 

Dr. Brunetto Chiarelli, Instituto Genetica, Via S. Epifanio 14, Pavia,
Italy.

Dr. Enrico Gandini, c/o C.N.T.S., Via Ramazzini 15, Rome, Italy.

Dott. Franco Malossini, Instituto Sperimentale Zootecnico, Montero- 
tondo-Scalo, Rome, Italy. 

Dr. Giuliana Moldifassi, Viale Oberdan 6, Pavia, Italy. 

Dr. Luciana Quagliotti, Instituto di Agronomico, Via Michelangelo 32, 
Torino, Italy. 

Dr. Auxilia Maria Teresa, Istituto Zootecnico, Via Piomezza 115, 
Torino, Italy. 


Western North American Region 


Dr. A. Richman, Faculty of Medicine, U.B.C., Tenth and Heather,
Vancouver 13, British Columbia, Canada.


NEWS AND ANNOUNCEMENTS


Members are invited to transmit to their National or Regional Secretary 
(if members at large, to the General Secretary) news of appointments, 
distinctions, or retirements, and announcements of professional interest. 


Edwin J. deBeer 
1902-1959 


The untimely death of Edwin J. deBeer, on October 27 following a 
protracted illness, took from the Society a founding member. Born in 
Portsmouth, Ohio, in 1902, Dr. deBeer was trained in biological chemis- 
try at the University of Pennsylvania. He remained on the teaching 
staff of the University for a year following completion of his graduate 
work in 1932 before joining the staff of Burroughs-Wellcome and Com- 
pany at Tuckahoe, New York, where, at the time of his death, he was 
Associate Director of Research and had served for long periods as 
Acting Director of Research. 

Like many young biochemists at that time, Dr. deBeer’s work at 
Burroughs-Wellcome led him into pharmacology and it was his nature 
to see very quickly the great need for biometric techniques in his new
and rapidly expanding field. By self-training and inquiry, he mastered 
those branches of biometry that he required to the point where he 
devised simple, graphical methods for the analysis of all-or-none data 
and dose-time reactions. These methods came fortuitously at a period 
when others like him felt the need for reasonably good approximate 
procedures of this kind. 

Dr. deBeer demonstrated his organizing ability during the mid- 
thirties by bringing a group of pharmacologists together for study and 
discussion of biometric methods. This experience resulted in his organ- 
izing in 1949 a highly successful conference for the New York Academy 
of Sciences under the title, The Place of Statistical Method in Biological 
and Chemical Experimentation. He also served for several years as a 
member or chairman of a Biometric Society committee to arrange 
sessions on the application of simple and advanced biometric techniques 
to pharmacologic data that were held at the annual meeting of the 
Federation of American Societies for Experimental Biology. Probably
the greatest direct benefit from these joint conferences was the demon- 
stration that no modern pharmacologic research program should be 
attempted without the services of an expert in biometry. 

While his major contributions to science and medicine were in 
pharmacology, rather than in biometry, Dr. deBeer was well known to 
all the group that met at Woods Hole in 1948 to found the Biometric 
Society. His sustained interest in and support of the Society’s activities 
will be missed. 

He was a Fellow of the American Association for the Advancement
of Science and of the Royal Society of Tropical Medicine, and a Fellow and
Council Member of the New York Academy of Sciences. He was a
member of the Regional Committee of ENAR from 1946 to 1948 and from 1956 to
1958, and a member of the Regional Advisory Board from 1950 to 1952.


NEW SECTION ON BOOK REVIEWS 


Biometrics plans a new section on book reviews. A committee
consisting of C. I. Bliss, L. L. Cavalli-Sforza, H. W. Norton, and S. C.
Pearce has developed a policy statement on book reviews and selected 
a review editor. The review editor will be J. G. Skellam. 

Biometrics welcomes for reviewing, or listing, all statistical and 
mathematical books with applications in biology or containing matter 
of potential importance in the biological sciences. Included are both
general and specialized books, monographs in which a mathematical 
or statistical approach is the key to a biological conclusion, statistical 
and numerical tables, and, in general, all books which explain statistics 
and mathematics to the biologist or biology to the statistician and 
mathematician. 

Books intended for review should be sent to the Review Editor:


Mr. J. G. Skellam
The Nature Conservancy
19 Belgrave Square
London, S.W. 1, England.


SUMMER OFFERINGS IN STATISTICS AT THE 
IOWA STATE UNIVERSITY 


The Department of Statistics at Iowa State University will offer six
applied courses in statistical theory and methods in its two 1960 summer 
sessions. These courses are planned primarily for graduate students or 
research workers with limited mathematical backgrounds who wish 
to use statistical techniques intelligently for application to other fields. 
In addition, courses in special topics in theoretical or applied statistics
may be studied at the graduate level. Senior staff members will be 
available during most of the summer for consultations on research or 
special problems. 

Students may register for either or both of the six-week summer 
sessions: June 6-July 13 and July 13-August 19. The complete list of
statistics offerings for the first session is as follows: Stat. 401, Statistical 
Methods for Research Workers (at the level of Snedecor’s Statistical 
Methods); Stat. 447, Statistical Theory for Research Workers (mainly 
theory of experimental statistics at the level of Anderson and Bancroft’s 
Statistical Theory in Research); Stat. 599, Special Topics; and Stat. 
699, Research. In the second session will be offered Stat. 402, a con- 
tinuation of 401; Stat. 448, a continuation of 447; two courses in applied 
methods which are more specialized—Stat. 411, Experimental Designs 
for Research Workers, and Stat. 421, Survey Designs for Research 
Workers; and finally Stat. 599 and 699. Additional information may 
be obtained from T. A. Bancroft, Department Head and Director, 
Statistical Laboratory, Iowa State University. 


SOUTHERN REGIONAL GRADUATE SUMMER SESSION 
IN STATISTICS AT UNIVERSITY OF FLORIDA 


The 1960 Southern Regional Graduate Summer Session in Statistics 
will be held at the University of Florida at Gainesville from June 20 to 
July 29, 1960. The University of Florida, North Carolina State College, 
Virginia Polytechnic Institute and Oklahoma State University have 
agreed to operate a continuing program of graduate summer sessions 
in statistics to be held at each institution in rotation. 

It is the purpose of this program to serve: (1) teachers of intro- 
ductory statistical courses and college teachers of mathematics who 
want formal training in modern statistics; (2) research and professional 
workers who want intensive instruction in basic statistical concepts 
and modern statistical methodology; (3) professional statisticians 
who wish to keep informed about advanced specialized theory and 
methods; (4) prospective candidates for graduate degrees in statistics; 
and (5) graduate students in other fields who desire supporting work in 
statistics. 

The session will last six weeks and courses will carry three semester 
hours of credit. Not more than two courses may be taken for credit 
at any one session. The summer work in statistics may be applied as 
residence credit at any one of the cooperating institutions, as well as 
certain other universities, in partial fulfillment of the requirements for 
a graduate degree. The program may be entered at any session, and
consecutive courses will follow in successive summers. 

The courses to be offered in statistics in 1960 at the University of 
Florida are as follows: Statistical Methods I, II and III, through 
Sample Survey Methods; Statistical Theory I, II and III, including 
Probability, Inference and Least Squares; Statistical Problems; Ad- 
vanced Statistical Inference; and Response Surfaces. In addition, a 
number of courses in the Mathematics Department will be available. 

The National Science Foundation is making available to the Univer- 
sity of Florida grants for college teachers of statistics and college 
teachers of mathematics who wish to attend the 1960 session. Appli- 
cants for these grants should be employed by an institution of higher 
learning as a teacher of mathematics or statistics; those from insti- 
tutions wherein there is no opportunity for formal training in modern 
inferential statistics and probability will be given priority. Applica- 
tions for grants should be postmarked not later than February 15, 1960 
to be assured of full consideration. 

Requests for application blanks for the summer session and for 
National Science Foundation grants should be addressed to Dr. Herbert 
A. Meyer, Statistical Laboratory, University of Florida, Box 3568, 
Gainesville, Florida. 


SUMMER SESSION AT THE UNIVERSITY OF MINNESOTA


The 1960 Graduate Summer Session of Statistics in the Health 
Sciences, sponsored by the accredited Schools of Public Health of the 
United States under a research training grant from the Division of 
General Medical Sciences of the National Institutes of Health of the 
United States Public Health Service, will be held at the University of 
Minnesota, School of Public Health, in Minneapolis, Minnesota from 
June 16 to July 30, 1960. A limited number of fellowships are available. 
For information, write Prof. J. E. Bearman, Univ. of Minnesota,
Minneapolis 14. The faculty will include: Dr. F. M. Hemphill, Dr.
Boyd Harshbarger, Dr. Robert B. Reid, Dr. Albert E. Bailey, Mr. 
Carol L. Erhardt, Dr. Chin Long Chiang, Dr. Bernard Greenberg, 
Dr. Donovan Thompson, and Dr. Eugene A. Johnson. 

Courses which will be taught are: Statistical Methods in Public 
Health; Management of Health Agency Records; Biostatistics in the 
Health Sciences; Demographic Methods in Public Health; Registration 
and Vital Records; Advanced Biostatistics in the Health Sciences; 
Statistical Methods in Epidemiology; Sampling Techniques in the 
Health Sciences; Statistical Methods in Biological Assay; Lecture 
Series.


INFORMATION ON SCIENCE IN THE U.S.S.R.


The Pergamon Institute, a non-profit foundation, was formed in Wash- 
ington, D. C. in 1957 for the purpose of making available to English- 
speaking scientists, doctors and engineers (from all countries that are 
members of the United Nations), the results of scientific, technological 
and medical research and development in the Soviet Union and other 
countries in the Soviet orbit. The Institute maintains offices in Wash- 
ington, London, and Oxford. The Pergamon Institute is planning a 
series of review volumes, Russian-English scientific and technical 
dictionaries and glossaries, and now publishes a number of selected 
Russian journals in an English translation. The Institute provides 
listing and abstracting services, biographical services, critical evalua- 
tion of Russian publications, books, journals, and microfilms and 
translations. The Institute is in need of qualified scientists, technologists 
and doctors with a knowledge of the Russian language who would be 
able and willing to undertake translation work on a spare-time basis 
for which they would be remunerated. Additional information may 
be obtained from the Pergamon Institute, 1404 New York Avenue, 
Northwest, Washington 5, D. C., U.S.A. 


MATHEMATICS OF COMPUTATION 


The publication, Mathematical Tables and Other Aids to Computation, 
has changed its name to Mathematics of Computation. The new name 
reflects the broadened scope of the journal which has expanded to 
meet the need in the U.S.A. of a publication devoted to numerical 
analysis and computation. The address for the journal is as follows: 
Printing and Publishing Office, National Academy of Sciences, National 
Research Council, 2101 Constitution Avenue, Washington 25, D. C., 
U.S.A. 


NEWS ABOUT MEMBERS 
ENAR 


James L. Cason, formerly in the Dairy Department at Rutgers 
University, has taken the position of Associate Professor in Dairying 
at the University of Maryland. 

Robert T. Chatterton, Jr., is a Graduate Assistant in the Department 
of Animal Husbandry at Cornell University. Mr. Chatterton was 
formerly Research Assistant in the Department of Animal Industry 
at the University of Connecticut. 

Phelps P. Crump has left the Brookhaven National Laboratory 
to take the position of Supervisory Mathematical Statistician at the
School of Aviation Medicine, Brooks Air Force Base, Texas. 

Constance E. Cox, who is at present Head of the Biometrics Section
of the Food and Drug Directorate in the Department of National 
Health and Welfare, Canada, will be on loan to the Colombo Plan for 
at least one year beginning in January, 1960. She will be taking the 
post of Lecturer in Statistics in a new academy which has been set up 
in Djakarta, Indonesia, to train statisticians and economists for govern- 
ment service. 

Gertrude M. Cox has resigned as Director, Institute of Statistics 
of The Consolidated University of North Carolina. She will continue 
for the present as Professor of Statistics at North Carolina State College. 
Dr. Cox assumed a new position as Head, Statistics Research Division, 
the Research Triangle Institute on January 1, 1960. 

Paul M. Densen was sworn in as Deputy Commissioner of the New 
York City Department of Health on Thursday, December 3, 1959, 
by Mayor Robert F. Wagner. Dr. Densen came to the Health Department
from the Health Insurance Plan of Greater New York, where he
had been Director of the Division of Research and Statistics since 1954.

Earl L. Diamond, formerly of the School of Public Health, University 
of North Carolina, has taken the position of Assistant Professor of 
Public Health Administration and Assistant Professor of Biostatistics 
at Johns Hopkins University. 

Charles F. Federspiel received his Ph.D. in Biostatistics at the Uni- 
versity of North Carolina and is presently Assistant Professor of 
Biostatistics in the Department of Preventive Medicine at Vanderbilt 
University. 

George E. Ferris is presently a Market Research Associate for the 
General Foods Corporation, White Plains, New York. 

R. Gnanadesikan has left the Procter and Gamble Company to 
become a member of the Technical Staff, Mathematics and Mechanics 
Research Department of the Bell Telephone Laboratories, Murray 
Hill, New Jersey. 

Leo A. Goodman, Professor of Statistics and Sociology at the
University of Chicago, has been awarded a Senior Postdoctoral Fellowship
by the National Science Foundation and a Fellowship by the John 
Simon Guggenheim Memorial Foundation; he is now at the Statistical 
Laboratory of the University of Cambridge, Cambridge, England, on 
leave of absence from the University of Chicago. 

Richard A. Greenberg has left his position as Research Statistician 
for the Connecticut State Department of Health to become a Graduate 
Student at Yale University.

Jack F. Hill has taken the position of Director of Research at the
Babcock Poultry Farm, Inc., in Ithaca, New York. 

J. Stuart Hunter is now Associate Professor at the Mathematics 
Research Center, University of Wisconsin. Dr. Hunter was formerly 
a Research Associate in the Statistical Techniques Research Group
at Princeton University. 

J. Edward Jackson completed all his requirements for a Ph.D. degree 
at VPI and is presently Senior Analyst in Management Systems Devel- 
opment Department at Eastman Kodak Company at Rochester, 
New York. 

Emil H. Jebe has taken the position of Research Mathematician, 
Operations Research Department, Willow Run Laboratories, University 
of Michigan. Dr. Jebe was formerly Associate Professor of Statistics 
at the Iowa State University in Ames, Iowa. 

Carl F. Kossack has left his position as Head of the Mathematics 
Department at Purdue University to become Manager of Operations 
Research and Statistics Department of the International Business 
Machines Corporation. 

Katherine B. Ladd, formerly Assistant Professor of Biostatistics 
in the College of Medicine, University of Vermont, Burlington, Vermont, 
has taken the position of Statistician with the Ontario Department of 
Health, Division of Maternal and Child Health. 

George F. Lunger is presently Senior Methods Specialist, Advanced 
Programming Activity, Electronic Data Processing, Industrial Elec- 
tronic Products Division of the Radio Corporation of America in 
Camden, New Jersey. He was formerly with the UNIVAC Division 
of the Sperry Rand Corporation. 

Forest L. Miller has left the National Bureau of Standards to become 
a Statistician for the Union Carbide Nuclear Company in Oak Ridge, 
Tennessee. 

G. B. Oakland, formerly Chief of the Statistical Research and Services, 
Canada Department of Agriculture, has been appointed Senior Research 
Statistician, The Dominion Bureau of Statistics, Ottawa, as of January 
1, 1960. 

Peter H. Ovenburg is an Instructor in the Department of Zoology 
at Michigan State University, East Lansing, Michigan. 

Charles E. Shelby has taken the position of Director of the Regional 
Swine Breeding Laboratory, Animal Husbandry Research Division, 
at the Iowa State University in Ames, Iowa. 

Lyle H. Wadell received his Ph.D. in November from the Iowa 
State University and is presently Research Associate in the Department 
of Animal Husbandry, Cornell University in Ithaca, New York.

Jack A. Marshall is a graduate assistant in the Department of
Statistics at the University of Chicago, Chicago, Illinois. 

Kamini M. Patwary has recently completed the requirements for 
the Ph.D. degree in Statistics at the American University, Washington, 
D. C. He has been appointed as “Statistician” to the World Health 
Organization at Geneva. In this position he will advise on uses of 
statistical techniques for health surveys in different parts of the world. 
Prior to this appointment he was an Assistant Professor at Howard 
University and a part time instructor of Mathematical Statistics at 
the University of Maryland. 

Donald J. Rosania was recently promoted to Technical Services 
Supervisor for the Toni Company Research Laboratories in Chicago. 
As well as other duties in this capacity, he will direct the Research 
Statistics Department. 


INTERNATIONAL JOURNAL OF ABSTRACTS 
STATISTICAL THEORY AND METHOD 


The aim of this new Journal is to give complete coverage of papers in the field of
statistical theory (including associated aspects of probability and other mathematical
methods) and new contributions to statistical method as published after
1st October 1958.


All contributions in the following five journals—being wholly devoted to this 
field—will be abstracted: Annals of Mathematical Statistics; Biometrika; Journal, 
Royal Statistical Society (Series B); Bulletin of Mathematical Statistics; Annals, 
Institute of Statistical Mathematics; and a further group of six journals will be 
abstracted on a virtually complete basis as follows: Biometrics; Metrika; Metron; 
Review, International Statistical Institute; Technometrics; Sankhyā. There are
about 250 other journals partly devoted to statistical theory and method from 
which the appropriate papers will be abstracted. 

A scheme of classification has been developed for the abstracts that is flexible
and facilitates the transfer of code numbers to punched-cards. A unique feature of
this Journal is that the pages are colour-tinted according to the main sections of
classification. This method of colour-coding the pages provides a distinctive and
powerful visual aid in the identification of abstracts in whatever manner the
Journal is filed for reference.

The abstracts will be about 400 words long—the recommendation of UNESCO 
for the “long” abstract service—and will be in the English language. This new 
Journal will be quarterly and contain approximately 1000 abstracts per year. 

Annual Subscription £5 (U.S.A. & CANADA $16.00) 
Single Number 30s. (U.S.A. & CANADA $ 4.50) 


OLIVER AND BOYD LTD. 
Tweeddale Court, 14 High Street, Edinburgh, 1


TABLE OF CONTENTS
Tables for the Sign Test when Observations are Estimates of Binomial 


Problems in Estimating Federal Government Expenditures. Samuel M. Cohn


Analysis of Vital Statistics by Census Tract
Elizabeth J. Coulter and Lillian Guralnick


Matrix Inversion, Its Interest and Application in Analysis of Data
S. G. Greenberg and A. E. Sarhan


The Lady Tasting Tea, and Allied Topics.............. N. T. Gridgeman
A Check on Gross Errors in Certain Variance Computations Hyman B. Kaitz
Automatic Programming for Automatic Computers..... Mitchell O. Locks
Comparison of Estimates of Circular Probable Error..... Paul B. Moranda
A Note of Mean Square Successive Differences............... J. N. K. Rao
A Multiple Comparison Sign Test, Treatment vs. Control...R. G. D. Steel


BOOK REVIEWS 
The American Statistical Association Invites as Members All Persons Inter- 
ested in: 
1. Development of new theory and method 
2. Improvement of basic statistical data 
3. Application of statistical methods to practical problems 


AMERICAN STATISTICAL ASSOCIATION 
1757 K Street, N. W., Washington 6, D. C. 


For further information, please contact the American Statistical Association, 
1757 K Street, N. W., Washington 6, D.C.


A Journal of Statistics for the
Physical, Chemical, and Engineering Sciences 


Vol. 1, No. 3 August, 1959 
CONTENTS 

Simplified Estimators for the Normal Distribution when Samples are Singly
Censored or.......................... A. Clifford Cohen, Jr.

Control Chart Tests Based on Geometric Moving Averages..S. W. Roberts

Factorial Experiments in Life Testing..................... Marvin Zelen


The Use of LaGrange Multipliers with Response Surfaces
A. W. Umland and W. N. Smith


A Statistical Model for Evaluating the Reliability of Safety Systems for
Plants Manufacturing Hazardous Products............ Louis B. Kahn


Vol. 1, No. 4 November, 1959 


CONTENTS 


Use of Half-Normal Plots in Interpreting Factorial Two Level Experiments
Cuthbert Daniel


On the Analysis of Factorial Experiments without Replication
Allan Birnbaum


Quality Control Methods for Several Related Variables J. Edward Jackson


Analysis of Latin Squares with a Certain Type of Row-Column Interaction
John Mandel


A Graphical Estimation of Mixed Weibull Parameters in Life Testing Electron
Evaluation of Chemical Analyses on Two Rocks............ W. J. Youden


Technometrics is published quarterly in February, May, August, and 
November. The annual non-member subscription rate is $8.00. To members 
of the American Statistical Association and the American Society for Quality 
Control the rate is $6.00. Inquiries should be addressed to either Technometrics, 
American Statistical Association 404, Beacon Bldg., 1757 K Street, N. W., 
Washington 6, D. C. or Technometrics, American Society for Quality Control, 
Rm. 6197, Plankinton Bldg., 161 Wisconsin Ave., Milwaukee 3, Wisconsin. 
INFORMATION FOR CONTRIBUTORS


Manuscripts 


Contributions for Biometrics may be addressed to Dr. Ralph A. Bradley, Department
of Statistics, The Florida State University, Tallahassee, Florida, U.S.A.;
authors residing in the following Society Regions can expedite consideration of papers
by submitting them to the appropriate Associate Editor, namely: BRITISH REGION:
Dr. S. C. Pearce, East Malling Research Station, East Malling, Maidstone,
Kent, England; AUSTRALASIAN REGION: Dr. E. A. Cornish, University of
Adelaide, Adelaide, Australia; FRENCH REGION: Dr. Georges Teissier, Faculté
des Sciences de Paris, 1 rue V. Cousin, Paris, France. QUERIES, NOTES, and
related correspondence should be directed to Dr. D. J. Finney, Department of
Statistics, University of Aberdeen, Meston Walk, Old Aberdeen, Scotland. Books
and material for BOOK REVIEWS should be sent to Mr. J. G. Skellam, The Nature
Conservancy, 19 Belgrave Square, London, S.W. 1, England.

MANUSCRIPTS must be submitted in triplicate, with typescript doublespaced 
throughout. Marginal notes may obviate typographical difficulties presented by 
complicated formulae or tables—authors should not attempt editorial instructions 
or markings for the printer. TABLES should be identified by arabic number and 
by a short descriptive title. ILLUSTRATIONS should also be identified by arabic 
number and by a brief caption. (Captions should not be included in illustrations, 
but should be typewritten collectively on an accompanying sheet.) Originals 
should be approximately 8.5 x 11 in. (21.5 x 28 cm.). The original of each chart, 
diagram, or graph should be executed in black on white drawing paper or board, on 
blue tracing linen, or on coordinate paper ruled in blue only; coordinate lines to be 
reproduced should be ruled in black. For printing, illustrations may be reduced to 
a fraction of their original dimensions. Lines should therefore be of sufficient thickness, and
decimal points, periods, and stippled dots should be solid black circles large enough 
to reproduce well. Lettering and numerals should be at least 1 mm. high when 
reproduced in a cut 3 in. (7.5 cm.) wide. Photographs should be prints on glossy 
paper with strong contrasts, and if grouped in a plate should be mounted contig- 
uously. All tables and illustrations should be mentioned explicitly in the text. 
REFERENCES (BIBLIOGRAPHIC) should be collectively listed alphabetically 
by author; textual citation by author and year is preferred. 


ABSTRACTS 


Abstracts of papers presented at meetings of the Biometric Society or of its 
regions are printed in Biometrics following such meetings. They should be submitted 
to the person designated to receive them for a particular meeting in exactly the form 
published in Biometrics (except for an Abstract Number), doublespaced on bond 
paper, and in duplicate. Use of formulae requiring display printing is to be avoided. 


ANNOUNCEMENTS, AND BIOMETRIC REPORTS


International and regional reports and notices should be submitted by the 
appropriate officers of the Society and its Regions in duplicate doublespaced on 
separate sheets exactly as they are to be printed in Biometrics. Other material to 
be printed in News and Announcements should also be submitted doublespaced 
and in duplicate. 


SUSTAINING MEMBERS OF THE BIOMETRIC SOCIETY


Abbott Laboratories
American Cancer Society, Inc.
General Foods Corporation, Research Center
Heisdorf and Nelson Farms, Inc.
Merck, Sharp and Dohme Research Laboratories
Schering Corporation
Smith, Kline and French Laboratories
E. R. Squibb and Sons
The Upjohn Company
Wallace Laboratories, Division of Carter Products
Wyeth Institute of Applied Biochemistry
BACK ISSUES


Back issues of Biometrics are available at the following postage-paid 
prices in U.S.A. currency: 


                      Price per        Price per
Volume    Numbers     Single Number    Volume (unbound)

          1 to 6      $1.
          1 to 6
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
Reprints of individual articles are not available except to authors at the
time of printing. Three special issues are among the numbers listed 
above. They are: 


1947 Volume 3 Number 1 The Analysis of Variance 
1951 Volume 7 Number 1 Components of Variance 
1957 Volume 13 Number 3 The Analysis of Covariance 


Also available are: 
Fishery Reprint Series (Selected reprints from Vol. 5). $1.00 
Subject Index (Volumes 1-10) 1.00 
Proceedings, International Biometric Symposium, 
Campinas, Brazil, 1955. 1.00 


Inquiries, non-member subscriptions, and orders for back issues and 
other material listed above should be addressed to: BIOMETRICS, DEPARTMENT
OF STATISTICS, THE FLORIDA STATE UNIVERSITY, TALLAHASSEE,
FLORIDA, U.S.A.